

13.3.5 Reader

To support Unicode characters, the reader has been extended to recognize characters written as hexadecimal codes. Thus #\U+41 is the ASCII capital letter “A”, since 41 is the hexadecimal code for that letter. The reader also recognizes the Unicode name of a character, with the spaces in the name replaced by underscores.
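As a small illustration of both notations, the sketch below reads the same letter by its hexadecimal code and by its Unicode name. The variable names are hypothetical, and the capitalization of the name is an assumption: standard character names are read without regard to case, and the same is assumed here for Unicode names.

```lisp
;; #\U+41 names the character whose code is hexadecimal 41 (decimal 65).
(defparameter *by-code* #\U+41)

;; Unicode name LATIN CAPITAL LETTER A, with spaces written as underscores.
;; Assumption: the name is matched without regard to case.
(defparameter *by-name* #\Latin_Capital_Letter_A)

;; Per the text above, each of these comparisons is expected to hold.
(defun same-letter-p ()
  (and (char= *by-code* #\A)
       (char= *by-name* #\A)
       (= (char-code *by-code*) #x41)))
```

Calling (same-letter-p) checks all three conditions at once.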

Recall that characters in CMUCL are only 16 bits long, so many Unicode characters cannot be represented as a single character object. Strings, however, can represent all Unicode characters.
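One way a string of 16-bit characters can hold a code point above #xFFFF is as a UTF-16 surrogate pair. The sketch below assumes that this is the string encoding in use. The function name surrogate-pair-string is hypothetical and is not part of CMUCL.

```lisp
;; Hypothetical helper: build the two 16-bit units that UTF-16 uses
;; for a code point in the range #x10000 to #x10FFFF.
(defun surrogate-pair-string (codepoint)
  "Return a two-character string holding the surrogate pair for CODEPOINT."
  (let* ((offset (- codepoint #x10000))
         ;; The top 10 bits of the offset go into the high surrogate.
         (high (+ #xD800 (ash offset -10)))
         ;; The bottom 10 bits go into the low surrogate.
         (low (+ #xDC00 (logand offset #x3FF))))
    (coerce (list (code-char high) (code-char low)) 'string)))

;; U+1F600 does not fit in 16 bits, so it occupies two string elements.
(defparameter *face* (surrogate-pair-string #x1F600))
```

Under this assumption, (length *face*) counts 16-bit units rather than Unicode characters, so code that indexes into such a string should be prepared to meet either half of a pair.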

When symbols are read, the symbol name is converted to Unicode NFC form before the symbol is interned into the package. Hence, (symbol-name (intern "string")) may produce a string that is not string= to "string". However, once "string" is itself converted to NFC form, the two strings are identical.
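The effect of this normalization can be sketched by reading a symbol whose name is spelled with a combining character. The variable names and the scratch package NFC-DEMO are hypothetical. The example assumes the default readtable and that the reader accepts the combining accent as part of a symbol name.

```lisp
;; Decomposed spelling: "E" followed by U+0301 COMBINING ACUTE ACCENT.
(defparameter *decomposed*
  (coerce (list #\E (code-char #x0301)) 'string))

;; Precomposed (NFC) spelling: the single character U+00C9.
(defparameter *composed*
  (string (code-char #x00C9)))

;; Read the decomposed text as a symbol in a scratch package, so the
;; example does not intern anything into the current package.
(defparameter *symbol*
  (let ((*package* (or (find-package "NFC-DEMO")
                       (make-package "NFC-DEMO" :use nil))))
    (read-from-string *decomposed*)))

;; Per the text above, the symbol name should match the NFC spelling
;; and should differ from the text that was actually read.
(defun name-is-nfc-p ()
  (and (string= (symbol-name *symbol*) *composed*)
       (not (string= (symbol-name *symbol*) *decomposed*))))
```

A practical consequence is that code comparing a symbol name against user-supplied text should normalize that text first, or the comparison may fail for names that look identical when printed.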