27.2.5.2. Type-Specific Plugin Structures and Functions

In the st_mysql_plugin structure that defines a plugin's general declaration, the info member points to a type-specific plugin descriptor. For a full-text parser plugin, the descriptor corresponds to the st_mysql_ftparser structure in the plugin.h file:

struct st_mysql_ftparser
{
  int interface_version;
  int (*parse)(MYSQL_FTPARSER_PARAM *param);
  int (*init)(MYSQL_FTPARSER_PARAM *param);
  int (*deinit)(MYSQL_FTPARSER_PARAM *param);
};

As shown by the structure definition, the descriptor has a version number (MYSQL_FTPARSER_INTERFACE_VERSION for full-text parser plugins) and contains pointers to three functions. The init and deinit members should point to a function or be set to 0 if the function is not needed. The parse member must point to the function that performs the parsing.
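
As a minimal sketch of such a descriptor, the following declares a parser that supplies only a parse function and sets init and deinit to 0. It is written to compile without the MySQL headers, so the structure is repeated locally, MYSQL_FTPARSER_PARAM is left opaque, and the value given for MYSQL_FTPARSER_INTERFACE_VERSION is a placeholder rather than the real constant. The names example_parse and example_descriptor are hypothetical.

```c
#include <stddef.h>

/* Stand-in declarations so the sketch compiles without MySQL headers.
   In a real plugin these come from plugin.h. */
typedef struct st_mysql_ftparser_param MYSQL_FTPARSER_PARAM; /* opaque here */

#ifndef MYSQL_FTPARSER_INTERFACE_VERSION
#define MYSQL_FTPARSER_INTERFACE_VERSION 0 /* placeholder, not the real value */
#endif

struct st_mysql_ftparser
{
  int interface_version;
  int (*parse)(MYSQL_FTPARSER_PARAM *param);
  int (*init)(MYSQL_FTPARSER_PARAM *param);
  int (*deinit)(MYSQL_FTPARSER_PARAM *param);
};

/* Hypothetical parsing function; a real one would examine param. */
static int example_parse(MYSQL_FTPARSER_PARAM *param)
{
  (void) param;
  return 0;                         /* zero reports success */
}

/* The descriptor that the info member of st_mysql_plugin would point to. */
static struct st_mysql_ftparser example_descriptor =
{
  MYSQL_FTPARSER_INTERFACE_VERSION, /* interface version */
  example_parse,                    /* parse: required */
  0,                                /* init: not needed */
  0                                 /* deinit: not needed */
};
```

Because init and deinit are 0 here, such a plugin would keep no per-statement state of its own.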

A full-text parser plugin is used in two different contexts, indexing and searching. In both contexts, the server calls the initialization and deinitialization functions at the beginning and end of processing each SQL statement that causes the plugin to be invoked. However, during statement processing, the server calls the main parsing function in context-specific fashion:

  • For indexing, the server calls the parser for each column value to be indexed.

  • For searching, the server calls the parser to parse the search string. The parser might also be called for rows processed by the statement. In natural language mode, the server does not need to call the parser for these rows. For boolean mode phrase searches or natural language searches with query expansion, the parser is used to parse column values for information that is not in the index. Also, if a boolean mode search is done for a column that has no FULLTEXT index, the built-in parser is called. (Plugins are associated with specific indexes. If there is no index, no plugin is used.)

Note that the plugin declaration in the plugin library descriptor has initialization and deinitialization functions, and so does the plugin descriptor to which it points. These pairs of functions have different purposes and are invoked for different reasons:

  • For the plugin declaration in the plugin library descriptor, the initialization and deinitialization functions are invoked when the plugin is loaded and unloaded.

  • For the plugin descriptor, the initialization and deinitialization functions are invoked per SQL statement for which the plugin is used.
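
The per-statement pair in the plugin descriptor could be sketched as follows. This is an illustration only: it assumes the plugin wants a small block of private state for each statement and stores it in the ftparser_state member described later in this section. The MYSQL_FTPARSER_PARAM shown here is a reduced stand-in containing just that member, and example_state, example_init, and example_deinit are hypothetical names.

```c
#include <stdlib.h>

/* Reduced stand-in for MYSQL_FTPARSER_PARAM: only the member used here. */
typedef struct st_mysql_ftparser_param
{
  void *ftparser_state;
} MYSQL_FTPARSER_PARAM;

/* Hypothetical per-statement state kept by the plugin. */
struct example_state
{
  unsigned long words_seen;
};

/* Called before a statement starts using the plugin. */
static int example_init(MYSQL_FTPARSER_PARAM *param)
{
  struct example_state *state = calloc(1, sizeof(*state));
  if (state == NULL)
    return 1;                   /* non-zero reports failure */
  param->ftparser_state = state;
  return 0;
}

/* Called after the statement has finished with the plugin. */
static int example_deinit(MYSQL_FTPARSER_PARAM *param)
{
  free(param->ftparser_state);
  param->ftparser_state = NULL;
  return 0;
}
```

Resources that must live as long as the plugin is loaded belong in the functions named by the plugin declaration instead.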

Each interface function named in the plugin descriptor should return zero for success or non-zero for failure, and each of them receives an argument that points to a MYSQL_FTPARSER_PARAM structure containing the parsing context. The structure has this definition:

typedef struct st_mysql_ftparser_param
{
  int (*mysql_parse)(void *param, byte *doc, uint doc_len);
  int (*mysql_add_word)(void *param, byte *word, uint word_len,
                        MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info);
  void *ftparser_state;
  void *mysql_ftparam;
  CHARSET_INFO *cs;
  byte *doc;
  uint length;
  int mode;
} MYSQL_FTPARSER_PARAM;

The structure members are used as follows:

  • mysql_parse

    A pointer to a callback function that invokes the server's built-in parser. Use this callback when the plugin acts as a front end to the built-in parser. That is, when the plugin parsing function is called, it should process the input to extract the text and pass the text to the mysql_parse callback.

    The first parameter for this callback function should be the mysql_ftparam member of the parsing context structure. That is, if param points to the structure, invoke the callback like this:

    param->mysql_parse(param->mysql_ftparam, ...);
    

    A front end plugin can extract text and pass it all at once to the built-in parser, or it can extract and pass text to the built-in parser a piece at a time. However, in the latter case, the built-in parser treats the pieces of text as though there are implicit word breaks between them.

  • mysql_add_word

    A pointer to a callback function that adds a word to a full-text index or to the list of search terms. Use this callback when the parser plugin replaces the built-in parser. That is, when the plugin parsing function is called, it should parse the input into words and invoke the mysql_add_word callback for each word.

    The first parameter for this callback function should be the mysql_ftparam member of the parsing context structure. That is, if param points to the structure, invoke the callback like this:

    param->mysql_add_word(param->mysql_ftparam, ...);
    
  • ftparser_state

    This is a generic pointer. The plugin can set it to point to information to be used internally for its own purposes.

  • mysql_ftparam

    This is set by the server. It is passed as the first argument to the mysql_parse or mysql_add_word callback.

  • cs

    A pointer to information about the character set of the text, or 0 if no information is available.

  • doc

    A pointer to the text to be parsed.

  • length

    The length of the text to be parsed, in bytes.

  • mode

    The parsing mode. This value will be one of the following constants:

    • MYSQL_FTPARSER_SIMPLE_MODE

      Parse in fast and simple mode, which is used for indexing and for natural language queries. The parser should pass to the server only those words that should be indexed. If the parser uses length limits or a stopword list to determine which words to ignore, it should not pass such words to the server.

    • MYSQL_FTPARSER_WITH_STOPWORDS

      Parse in stopword mode. This is used in boolean searches for phrase matching. The parser should pass all words to the server, even stopwords or words that are outside any normal length limits.

    • MYSQL_FTPARSER_FULL_BOOLEAN_INFO

      Parse in boolean mode. This is used for parsing boolean query strings. The parser should recognize not only words but also boolean-mode operators and pass them to the server as tokens via the mysql_add_word callback. To tell the server what kind of token is being passed, the plugin needs to fill in a MYSQL_FTPARSER_BOOLEAN_INFO structure and pass a pointer to it.
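
To show how these members fit together, here is a rough sketch of a parsing function that replaces the built-in parser: it splits the text on whitespace, calls mysql_add_word for each word, and consults mode to decide whether short words are dropped. It makes several assumptions. The structures are repeated locally with byte and uint spelled out so that it compiles without the MySQL headers, the numeric values of the enum constants are assumed, the length limit EXAMPLE_MIN_WORD_LEN is invented, and example_count_word is a stand-in for the server side of the callback. Boolean-mode operators are not recognized; every token is passed as a plain optional word.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for declarations that plugin.h supplies in a real build.
   Enum values are assumptions made for this sketch. */
enum enum_ft_token_type { FT_TOKEN_EOF, FT_TOKEN_WORD, FT_TOKEN_LEFT_PAREN,
                          FT_TOKEN_RIGHT_PAREN, FT_TOKEN_STOPWORD };
enum { MYSQL_FTPARSER_SIMPLE_MODE, MYSQL_FTPARSER_WITH_STOPWORDS,
       MYSQL_FTPARSER_FULL_BOOLEAN_INFO };

typedef struct st_mysql_ftparser_boolean_info
{
  enum enum_ft_token_type type;
  int yesno;
  int weight_adjust;
  bool wasign;
  bool trunc;
  unsigned char prev;
  unsigned char *quot;
} MYSQL_FTPARSER_BOOLEAN_INFO;

typedef struct st_mysql_ftparser_param
{
  int (*mysql_parse)(void *param, unsigned char *doc, unsigned int doc_len);
  int (*mysql_add_word)(void *param, unsigned char *word, unsigned int word_len,
                        MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info);
  void *ftparser_state;
  void *mysql_ftparam;
  void *cs;                       /* CHARSET_INFO * in the real header */
  unsigned char *doc;
  unsigned int length;
  int mode;
} MYSQL_FTPARSER_PARAM;

#define EXAMPLE_MIN_WORD_LEN 4    /* hypothetical length limit */

/* Stand-in for the server's callback: counts the words it receives. */
static int example_count_word(void *ftparam, unsigned char *word,
                              unsigned int word_len,
                              MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info)
{
  (void) word; (void) word_len; (void) boolean_info;
  ++*(int *) ftparam;
  return 0;
}

/* Split param->doc on whitespace and hand each word to the server. */
static int example_parse(MYSQL_FTPARSER_PARAM *param)
{
  unsigned char *pos = param->doc;
  unsigned char *end = param->doc + param->length;
  /* A plain optional word; prev and quot are left at zero. */
  MYSQL_FTPARSER_BOOLEAN_INFO info = { .type = FT_TOKEN_WORD };

  while (pos < end)
  {
    unsigned char *start;
    unsigned int len;

    while (pos < end && isspace(*pos))      /* skip separators */
      pos++;
    start = pos;
    while (pos < end && !isspace(*pos))     /* scan one word */
      pos++;
    len = (unsigned int) (pos - start);
    if (len == 0)
      break;

    /* Simple mode: do not pass words the plugin would not index.
       Other modes: pass every word through. */
    if (param->mode == MYSQL_FTPARSER_SIMPLE_MODE && len < EXAMPLE_MIN_WORD_LEN)
      continue;

    if (param->mysql_add_word(param->mysql_ftparam, start, len, &info))
      return 1;                             /* propagate failure */
  }
  return 0;
}
```

With the text "the quick brown fox", this sketch passes two words in simple mode and all four in stopword mode, since "the" and "fox" fall under the invented length limit.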

If the parser is called in boolean mode, the param->mode value will be MYSQL_FTPARSER_FULL_BOOLEAN_INFO. The MYSQL_FTPARSER_BOOLEAN_INFO structure that the parser uses for passing token information to the server looks like this:

typedef struct st_mysql_ftparser_boolean_info
{
  enum enum_ft_token_type type;
  int yesno;
  int weight_adjust;
  bool wasign;
  bool trunc;
  /* These are parser state and must be removed. */
  byte prev;
  byte *quot;
} MYSQL_FTPARSER_BOOLEAN_INFO;

The parser should fill in the structure members as follows:

  • type

    The token type. This should be one of the values shown in the following table:

    Type                  Meaning
    FT_TOKEN_EOF          End of data
    FT_TOKEN_WORD         A regular word
    FT_TOKEN_LEFT_PAREN   The beginning of a group or subexpression
    FT_TOKEN_RIGHT_PAREN  The end of a group or subexpression
    FT_TOKEN_STOPWORD     A stopword

  • yesno

    Whether the word must be present for a match to occur. 0 means that the word is optional but increases the match relevance if it is present. Values larger than 0 mean that the word must be present. Values smaller than 0 mean that the word must not be present.

  • weight_adjust

    A weighting factor that determines how much a match for the word counts. It can be used to increase or decrease the word's importance in relevance calculations. A value of zero indicates no weight adjustment. Values greater than or less than zero mean higher or lower weight, respectively. The examples at Section 12.7.1, “Boolean Full-Text Searches”, that use the < and > operators illustrate how weighting works.

  • wasign

    The sign of the weighting factor. A negative value acts like the ~ boolean-search operator, which causes the word's contribution to the relevance to be negative.

  • trunc

    Whether matching should be done as if the boolean-mode * truncation operator had been given.

Plugins should not use the prev and quot members of the MYSQL_FTPARSER_BOOLEAN_INFO structure.
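
The following sketch illustrates one way a plugin might fill in the structure for a single boolean-mode term. It assumes the plugin wants to mimic the built-in operators (+ and - for required and excluded words, > and < for weighting, ~ for negated contribution, and a trailing * for truncation); a plugin is free to use a different query syntax. The structure is repeated locally so that the code compiles without the MySQL headers, the enum values are assumed, and example_fill_info is a hypothetical helper, not part of the plugin API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for declarations that plugin.h supplies in a real build. */
enum enum_ft_token_type { FT_TOKEN_EOF, FT_TOKEN_WORD, FT_TOKEN_LEFT_PAREN,
                          FT_TOKEN_RIGHT_PAREN, FT_TOKEN_STOPWORD };

typedef struct st_mysql_ftparser_boolean_info
{
  enum enum_ft_token_type type;
  int yesno;
  int weight_adjust;
  bool wasign;
  bool trunc;
  unsigned char prev;
  unsigned char *quot;
} MYSQL_FTPARSER_BOOLEAN_INFO;

/* Hypothetical helper: describe one term such as "+apple*".
   Leading operators are consumed; *word and *word_len receive the
   bare word without operators. */
static void example_fill_info(const char *term,
                              MYSQL_FTPARSER_BOOLEAN_INFO *info,
                              const char **word, size_t *word_len)
{
  size_t len;

  memset(info, 0, sizeof(*info));   /* prev and quot stay zero, unused */
  info->type = FT_TOKEN_WORD;

  for (;; term++)
  {
    if (*term == '+')       info->yesno = 1;        /* must be present */
    else if (*term == '-')  info->yesno = -1;       /* must not be present */
    else if (*term == '>')  info->weight_adjust++;  /* higher weight */
    else if (*term == '<')  info->weight_adjust--;  /* lower weight */
    else if (*term == '~')  info->wasign = true;    /* negative contribution */
    else break;
  }

  len = strlen(term);
  if (len > 0 && term[len - 1] == '*')
  {
    info->trunc = true;             /* match as if * had been given */
    len--;
  }
  *word = term;
  *word_len = len;
}
```

A parsing function would call such a helper once per term and then pass the word, its length, and a pointer to the filled-in structure to the mysql_add_word callback.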


 
 
Published under the terms of the GNU General Public License