GNU C Library (libc) Programming Guide - Restartable multibyte conversion

Next: Non-reentrant Conversion, Previous: Charset Function Overview, Up: Character Set Handling

6.3 Restartable Multibyte Conversion Functions

The ISO C standard defines functions to convert strings from a multibyte representation to wide character strings. There are a number of peculiarities:

The character set assumed for the multibyte encoding is not specified as an argument to the functions. Instead the character set specified by the LC_CTYPE category of the current locale is used; see Locale Categories.
The functions handling more than one character at a time require NUL terminated strings as the argument (i.e., converting blocks of text does not work unless one can add a NUL byte at an appropriate place). The GNU C library contains some extensions to the standard that allow specifying a size, but basically they also expect terminated strings.

Despite these limitations the ISO C functions can be used in many contexts. In graphical user interfaces, for instance, it is not uncommon to have functions that require text to be displayed in a wide character string if the text is not simple ASCII. The text itself might come from a file with translations and the user should decide about the current locale, which determines the translation and therefore also the external encoding used. In such a situation (and many others) the functions described here are perfect. If more freedom while performing the conversion is necessary take a look at the iconv functions (see Generic Charset Conversion).