Previous: , Up: Input and Output   [Contents][Index]


6.14.15 Handling of Unicode Byte Order Marks

This section documents the finer points of Guile’s handling of Unicode byte order marks (BOMs). A byte order mark (U+FEFF) is typically found at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably determine the byte order. Occasionally, a BOM is found at the start of a UTF-8 stream, but this is much less common and not generally recommended.

Guile attempts to handle BOMs automatically, and in accordance with the recommendations of the Unicode Standard, when the port encoding is set to UTF-8, UTF-16, or UTF-32. In brief, Guile automatically writes a BOM at the start of a UTF-16 or UTF-32 stream, and automatically consumes one from the start of a UTF-8, UTF-16, or UTF-32 stream.

As specified in the Unicode Standard, a BOM is only handled specially at the start of a stream, and only if the port encoding is set to UTF-8, UTF-16 or UTF-32. If the port encoding is set to UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE, then BOMs are not handled specially, and none of the special handling described in this section applies.


Previous: , Up: Input and Output   [Contents][Index]