Chapter 11. Implementation Details

Here we document details of how the preprocessor's implementation affects its user-visible behavior. You should try to avoid undue reliance on behavior described here, as it is possible that it will change subtly in future implementations.

Also documented here are obsolete features and changes from previous versions of CPP.

11.1. Implementation-defined behavior

This is how CPP behaves in all the cases which the C standard describes as implementation-defined. This term means that the implementation is free to do what it likes, but must document its choice and stick to it.

  • The mapping of physical source file multi-byte characters to the execution character set.

    Currently, CPP requires its input to be ASCII or UTF-8. The execution character set may be controlled by the user, with the -fexec-charset and -fwide-exec-charset options.

  • Identifier characters.

    The C and C++ standards allow identifiers to be composed of _ and the alphanumeric characters. C++ and C99 also allow universal character names (not implemented in GCC), and C99 further permits implementation-defined characters.

    GCC allows the $ character in identifiers as an extension for most targets. This is true regardless of the -std= switch, since this extension cannot conflict with standards-conforming programs. When preprocessing assembler, however, dollars are not identifier characters by default.

    Currently the targets that by default do not permit $ are AVR, IP2K, MMIX, MIPS Irix 3, ARM aout, and PowerPC targets for the AIX and BeOS operating systems.

    You can override the default with -fdollars-in-identifiers or -fno-dollars-in-identifiers.

  • Non-empty sequences of whitespace characters.

    In textual output, each whitespace sequence is collapsed to a single space. For aesthetic reasons, the first token on each non-directive line of output is preceded with sufficient spaces that it appears in the same column as it did in the original source file.

  • The numeric value of character constants in preprocessor expressions.

    The preprocessor and compiler interpret character constants in the same way; i.e. escape sequences such as \a are given the values they would have on the target machine.

    The compiler evaluates a multi-character character constant one character at a time, shifting the previous value left by the number of bits per target character, and then or-ing in the bit-pattern of the new character truncated to the width of a target character. The final bit-pattern is given type int, and is therefore signed, regardless of whether single characters are signed or not (a slight change from versions 3.1 and earlier of GCC). If there are more characters in the constant than would fit in the target int, the compiler issues a warning, and the excess leading characters are ignored.

    For example, 'ab' for a target with an 8-bit char would be interpreted as (int) ((unsigned char) 'a' * 256 + (unsigned char) 'b'), and '\234a' as (int) ((unsigned char) '\234' * 256 + (unsigned char) 'a').

  • Source file inclusion.

    For a discussion on how the preprocessor locates header files, see Section 2.2 Include Operation.

  • Interpretation of the filename resulting from a macro-expanded #include directive.

    See Section 2.5 Computed Includes.

  • Treatment of a #pragma directive that after macro-expansion results in a standard pragma.

    No macro expansion occurs on any #pragma directive line, so the question does not arise.

    Note that GCC does not yet implement any of the standard pragmas.

  Published under the terms of the GNU General Public License