5.10.3. Adding a New Character Set

This section discusses the procedure for adding a new character set to MySQL. You must have a MySQL source distribution to use these instructions. To choose the proper procedure, determine whether the character set is simple or complex:

  • If the character set does not need to use special string collating routines for sorting and does not need multi-byte character support, it is simple.

  • If it needs either of those features, it is complex.

For example, latin1 and danish are simple character sets, whereas big5 and czech are complex character sets.

In the following instructions, the name of the character set is represented by MYSET.

For a simple character set, do the following:

  1. Add MYSET to the end of the sql/share/charsets/Index file. Assign a unique number to it.

  2. Create the file sql/share/charsets/MYSET.conf. (You can use a copy of sql/share/charsets/latin1.conf as the basis for this file.)

    The syntax for the file is very simple:

    • Comments start with a ‘#’ character and continue to the end of the line.

    • Words are separated by arbitrary amounts of whitespace.

    • When defining the character set, every word must be a number in hexadecimal format.

    • The ctype[] array takes up the first 257 words. The to_lower[], to_upper[], and sort_order[] arrays each take up 256 words after that.

    See Section 5.10.4, “The Character Definition Arrays”.

  3. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in

  4. Reconfigure, recompile, and test.
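The file syntax described in step 2 can be checked with a small standalone reader. The following is a minimal sketch in C, not code from the MySQL sources: the struct `charset_tables` and the function `parse_charset_conf()` are hypothetical names, and the sketch assumes every word is a one-byte hexadecimal value. It skips ‘#’ comments, splits on whitespace, and expects exactly 257 + 3 × 256 = 1025 words.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Array sizes as described above: ctype has 257 words, the others 256. */
#define CTYPE_WORDS 257
#define TABLE_WORDS 256
#define TOTAL_WORDS (CTYPE_WORDS + 3 * TABLE_WORDS)

/* Hypothetical container; not a structure from the MySQL sources. */
struct charset_tables {
    unsigned char ctype[CTYPE_WORDS];
    unsigned char to_lower[TABLE_WORDS];
    unsigned char to_upper[TABLE_WORDS];
    unsigned char sort_order[TABLE_WORDS];
};

/*
 * Parse conf-style text into *out.
 * Returns 0 on success, or -1 if a word is not a hexadecimal byte value
 * or the total number of words is not exactly TOTAL_WORDS.
 */
int parse_charset_conf(const char *text, struct charset_tables *out)
{
    unsigned char words[TOTAL_WORDS];
    size_t count = 0;
    const char *p = text;

    assert(text != NULL && out != NULL);

    while (*p != '\0') {
        char *end;
        unsigned long value;

        if (isspace((unsigned char)*p)) {      /* word separator */
            p++;
            continue;
        }
        if (*p == '#') {                       /* comment runs to end of line */
            while (*p != '\0' && *p != '\n')
                p++;
            continue;
        }
        if (!isxdigit((unsigned char)*p))      /* every word must be hex */
            return -1;
        value = strtoul(p, &end, 16);
        if (end == p || value > 0xFF)
            return -1;
        if (*end != '\0' && *end != '#' && !isspace((unsigned char)*end))
            return -1;                         /* trailing junk inside the word */
        if (count == TOTAL_WORDS)              /* more words than the arrays hold */
            return -1;
        words[count++] = (unsigned char)value;
        p = end;
    }
    if (count != TOTAL_WORDS)
        return -1;

    /* Distribute the words in file order: ctype, to_lower, to_upper, sort_order. */
    memcpy(out->ctype, words, CTYPE_WORDS);
    memcpy(out->to_lower, words + CTYPE_WORDS, TABLE_WORDS);
    memcpy(out->to_upper, words + CTYPE_WORDS + TABLE_WORDS, TABLE_WORDS);
    memcpy(out->sort_order, words + CTYPE_WORDS + 2 * TABLE_WORDS, TABLE_WORDS);
    return 0;
}
```

Running a new MYSET.conf through a check like this before reconfiguring can catch a miscounted array early. The real loader in the MySQL source may behave differently.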

For a complex character set, do the following:

  1. Create the file strings/ctype-MYSET.c in the MySQL source distribution.

  2. Add MYSET to the end of the sql/share/charsets/Index file. Assign a unique number to it.

  3. Look at one of the existing ctype-*.c files (such as strings/ctype-big5.c) to see what needs to be defined. Note that the arrays in your file must have names like ctype_MYSET, to_lower_MYSET, and so on. These correspond to the arrays for a simple character set. See Section 5.10.4, “The Character Definition Arrays”.

  4. Near the top of the file, place a special comment like this:

    /*
     * This comment is parsed by configure to create ctype.c,
     * so don't change it unless you know what you are doing.
     * .configure. number_MYSET=MYNUMBER
     * .configure. strxfrm_multiply_MYSET=N
     * .configure. mbmaxlen_MYSET=N
     */

    The configure program uses this comment to include the character set in the MySQL library automatically.

    The strxfrm_multiply and mbmaxlen lines are explained in the following sections. You need to include them only if you need the string collating functions or the multi-byte character set functions, respectively.

  5. You should then create some of the following functions:

    • my_strncoll_MYSET()

    • my_strcoll_MYSET()

    • my_strxfrm_MYSET()

    • my_like_range_MYSET()

    See Section 5.10.5, “String Collating Support”.

  6. Add the character set name to the CHARSETS_AVAILABLE and COMPILED_CHARSETS lists in

  7. Reconfigure, recompile, and test.
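To give a feel for how the arrays from step 3 and the functions from step 5 relate, here is a minimal sketch in C. It is an illustration under stated assumptions, not code from any ctype-*.c file: the function names `init_sort_order_MYSET()` and `my_strncoll_MYSET_sketch()`, the signature, and the table contents are all hypothetical. Check an existing file such as strings/ctype-big5.c for the real prototypes. The sketch compares two byte strings by mapping each byte through a sort_order table. A real complex character set would do more, for example handling multi-byte sequences.

```c
#include <assert.h>
#include <stddef.h>

/* Table named after the ctype_MYSET / to_lower_MYSET pattern (contents are illustrative). */
static unsigned char sort_order_MYSET[256];

/*
 * Fill the table so that ASCII letters sort case-insensitively and every
 * other byte sorts by its own value. Illustration only.
 */
void init_sort_order_MYSET(void)
{
    for (int i = 0; i < 256; i++) {
        if (i >= 'a' && i <= 'z')
            sort_order_MYSET[i] = (unsigned char)(i - 'a' + 'A');
        else
            sort_order_MYSET[i] = (unsigned char)i;
    }
}

/*
 * Length-bounded comparison through the sort_order table.
 * Returns -1, 0, or 1. The signature is an assumption, not MySQL's.
 */
int my_strncoll_MYSET_sketch(const unsigned char *s1, size_t len1,
                             const unsigned char *s2, size_t len2)
{
    size_t n = len1 < len2 ? len1 : len2;

    assert(s1 != NULL && s2 != NULL);

    for (size_t i = 0; i < n; i++) {
        int a = sort_order_MYSET[s1[i]];
        int b = sort_order_MYSET[s2[i]];
        if (a != b)
            return a < b ? -1 : 1;
    }
    /* Equal over the common prefix: the shorter string sorts first. */
    if (len1 == len2)
        return 0;
    return len1 < len2 ? -1 : 1;
}
```

With this table, "abc" and "ABC" compare as equal, and "ab" sorts before "abc".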

The sql/share/charsets/README file includes additional instructions.

If you want to have the character set included in the MySQL distribution, mail a patch to the MySQL internals mailing list. See Section 1.7.1, “MySQL Mailing Lists”.

  Published under the terms of the GNU General Public License