27.2 Enabling Multibyte Characters
You can enable or disable multibyte character support, either for
Emacs as a whole, or for a single buffer. When multibyte characters are
disabled in a buffer, then each byte in that buffer represents a
character, even codes 0200 through 0377. The old features for
supporting the European character sets, ISO Latin-1 and ISO Latin-2,
work as they did in Emacs 19 and also work for the other ISO 8859
character sets.
However, there is no need to turn off multibyte character support to
use ISO Latin; the Emacs multibyte character set includes all the
characters in these character sets, and Emacs can translate
automatically to and from the ISO codes.
By default, Emacs starts in multibyte mode, because that allows you to
use all the supported languages and scripts without limitations.
To edit a particular file in unibyte representation, visit it using
find-file-literally. See Visiting. To convert a buffer in
multibyte representation into a single-byte representation of the same
characters, the easiest way is to save the contents in a file, kill the
buffer, and find the file again with find-file-literally. You
can also use C-x <RET> c
(universal-coding-system-argument) and specify ‘raw-text’ as
the coding system with which to find or save a file. See Specify Coding.
Finding a file as ‘raw-text’ doesn't disable format
conversion, uncompression and auto mode selection as
find-file-literally does.
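The two ways of visiting a file described above can also be expressed in Lisp. This is a minimal sketch, not the manual's own code: the function names my-visit-literally and my-visit-as-raw-text are hypothetical, and the second function assumes that binding coding-system-for-read around find-file has the same effect on reading as typing C-x <RET> c raw-text <RET> first.

```elisp
;; Hypothetical helper: visit FILE with no conversions at all.
(defun my-visit-literally (file)
  "Visit FILE using `find-file-literally'."
  (interactive "fFind file literally: ")
  (find-file-literally file))

;; Hypothetical helper: visit FILE as raw-text.  Format conversion,
;; uncompression and mode selection still happen as usual.
(defun my-visit-as-raw-text (file)
  "Visit FILE with `raw-text' as the coding system for reading."
  (interactive "fFind file as raw-text: ")
  (let ((coding-system-for-read 'raw-text))
    (find-file file)))
```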
To turn off multibyte character support by default, start Emacs with
the ‘--unibyte’ option (see Initial Options), or set the
environment variable EMACS_UNIBYTE. You can also customize
enable-multibyte-characters or, equivalently, directly set the
variable default-enable-multibyte-characters to nil in
your init file to have basically the same effect as ‘--unibyte’.
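As a sketch of the init-file approach, the following line could go in .emacs. It assumes the variable meant here is default-enable-multibyte-characters and that you are running an Emacs version of the era this section describes; later releases may not provide that variable.

```elisp
;; In .emacs: make newly created buffers unibyte by default.
;; Assumption: this Emacs still defines
;; default-enable-multibyte-characters.
(setq default-enable-multibyte-characters nil)
```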
To convert a unibyte session to a multibyte session, set
default-enable-multibyte-characters to t. Buffers which
were created in the unibyte session before you turn on multibyte support
will stay unibyte. You can turn on multibyte support in a specific
buffer by invoking the command
toggle-enable-multibyte-characters in that buffer.
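From Lisp, the per-buffer switch can be sketched with the function set-buffer-multibyte and the buffer-local variable enable-multibyte-characters. The command name my-make-buffer-multibyte is hypothetical, and this is an illustration rather than the interactive command the text refers to. Note that set-buffer-multibyte reinterprets the buffer's existing bytes; it does not translate characters between encodings.

```elisp
;; Hypothetical command: turn on multibyte support in the current
;; buffer only, leaving other buffers and the session default alone.
(defun my-make-buffer-multibyte ()
  "Enable multibyte characters in the current buffer if they are off.
Return the resulting value of `enable-multibyte-characters'."
  (interactive)
  (unless enable-multibyte-characters
    (set-buffer-multibyte t))
  enable-multibyte-characters)
```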
With ‘--unibyte’, multibyte strings are not created during
initialization from the values of environment variables,
/etc/passwd entries etc. that contain non-ASCII 8-bit
characters.
Emacs normally loads Lisp files as multibyte, regardless of whether
you used ‘--unibyte’. This includes the Emacs initialization file,
.emacs, and the initialization files of Emacs packages such as
Gnus. However, you can specify unibyte loading for a particular Lisp
file, by putting ‘-*-unibyte: t;-*-’ in a comment on the first
line (see File Variables). Then that file is always loaded as
unibyte text, even if you did not start Emacs with ‘--unibyte’.
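For illustration, the top of such a Lisp file might look like the sketch below. The file name my-legacy.el and the variable my-legacy-greeting are hypothetical; only the first-line comment comes from the convention described above.

```elisp
;; -*-unibyte: t;-*-
;; my-legacy.el (hypothetical file): the first line above asks Emacs
;; to load this file as unibyte text.

(defvar my-legacy-greeting "hello"
  "Placeholder definition so the file has some content to load.")
```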
The motivation for these conventions is that it is more reliable to
always load any particular Lisp file in the same way. However, you can
load a Lisp file as unibyte, on any one occasion, by typing C-x
<RET> c raw-text <RET> immediately before loading it.
The mode line indicates whether multibyte character support is enabled
in the current buffer. If it is, there are two or more characters (most
often two dashes) before the colon near the beginning of the mode line.
When multibyte characters are not enabled, just one dash precedes the
colon.