C Programming/C trigraph

From Wikibooks, open books for an open world

Trigraphs

Trigraphs were removed from C in C23.[1] They were rarely used outside some old mainframe code, so dropping them broke very little in practice. The rest of this section applies to older versions of the standard.

C was designed in English and assumes the common ASCII character set made for English, which includes characters such as {, }, [, and ]. Some other character sets (like EBCDIC on mainframes), however, lack these or other characters that C requires. To solve this problem, the 1989 C standard (section 5.2.1.1) defined a set of trigraph sequences that can stand in for the missing symbols anywhere in a source file. In fact, the first translation phase of compilation specified in the 1989 C standard (section 5.1.1.2) includes replacing each trigraph sequence with its corresponding single character.

The following trigraph sequences exist, and no others. A question mark ? that does not begin one of the listed trigraph sequences is left unchanged.

Sequence Replacement
======== ===========
  ??=         #
  ??(         [
  ??/         \
  ??)         ]
  ??'         ^
  ??<         {
  ??!         |
  ??>         }
  ??-         ~

The effect of this is that statements such as

printf ("Eh???/n");

will, after the trigraph is replaced, be the equivalent of

printf ("Eh?\n");
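The replacement step described above can be sketched as an ordinary function that scans a buffer. This is only an illustration, not how any particular compiler implements it; the names `trigraph_replacement` and `replace_trigraphs` are hypothetical. The lookup strings are written so that no two question marks are adjacent in the source, which keeps the sketch itself unaffected by trigraph replacement.

```c
#include <string.h>

/* Hypothetical helper (not a standard function): returns the character that
   a trigraph ending in `third` stands for, or 0 if no such trigraph exists. */
static char trigraph_replacement(char third)
{
    /* Third characters and their replacements, in the order of the table. */
    static const char thirds[]       = "=(/)'<!>-";
    static const char replacements[] = "#[\\]^{|}~";
    const char *hit;

    if (third == '\0')
        return 0;               /* strchr would otherwise match the terminator */
    hit = strchr(thirds, third);
    return hit ? replacements[hit - thirds] : 0;
}

/* Rewrites `s` in place, replacing each trigraph with its single-character
   equivalent, and returns `s`. */
char *replace_trigraphs(char *s)
{
    char *in = s, *out = s;

    while (*in != '\0') {
        char r = 0;
        if (in[0] == '?' && in[1] == '?')
            r = trigraph_replacement(in[2]);
        if (r != 0) {
            *out++ = r;
            in += 3;            /* consume the whole trigraph */
        } else {
            *out++ = *in++;     /* copy any other character unchanged */
        }
    }
    *out = '\0';
    return s;
}
```

Applied to a buffer holding the characters Eh???/n, the function copies the first question mark unchanged, because the three question marks do not form a trigraph, and then replaces ??/ with a backslash.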

To keep a trigraph from being replaced inside a string literal or character constant, where the replacement would change the text, the programmer can simply escape the second question mark with a backslash; e.g.

printf ("Two question marks in a row: ?\?!\n");
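As a small sketch of why the escape works: `\?` is an escape sequence for a plain question mark, so the literal never contains two adjacent question marks in the source. The second function shows an alternative that is not mentioned above, relying on adjacent string literals being concatenated in a later translation phase than trigraph replacement. The function names are hypothetical.

```c
/* Both functions return the three characters ? ? ! whether or not the
   compiler performs trigraph replacement. */

const char *escaped_marks(void)
{
    return "?\?!";      /* \? is the escape sequence for a question mark */
}

const char *spliced_marks(void)
{
    return "?" "?!";    /* adjacent literals are joined after trigraphs are replaced */
}
```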

Amendment 1 to the C standard (1995) introduced the following punctuators, sometimes called digraphs; the 1999 C standard specifies them in section 6.4.6. They are equivalent to the following tokens except for their spelling:

Digraph Equivalent
======= ==========
   <:       [
   :>       ]
   <%       {
   %>       }
   %:       #
  %:%:      ##

In other words, a digraph produces different text from its equivalent token when stringized with the # operator during macro replacement, but the two are otherwise interchangeable.
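A minimal sketch of both points, assuming a compiler that supports digraphs (C95 or later): the function below is written with digraphs in place of brackets and braces, and the hypothetical `STR` macro exposes the spelling difference. Because digraphs are tokens rather than a textual replacement, they are left alone inside string literals, unlike trigraphs.

```c
#define STR(x) #x   /* hypothetical helper: stringizes the argument's spelling */

/* Sums the first n elements of a; written with digraphs instead of [ ] { }. */
int sum_digraphs(const int *a, int n)
<%
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a<:i:>;        /* same as a[i] */
    return total;
%>
```

Here `STR(<:)` yields the two-character string "<:", while `STR([)` yields "[", even though the two tokens mean the same thing to the compiler.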

References
