Standard Character Sets (Guile Reference Manual)

6.6.4.6 Standard Character Sets

To make the character set data type and its procedures convenient to use, Guile predefines several character set variables.

These character sets are locale independent and are not recomputed upon a setlocale call. They contain characters from the whole range of Unicode code points. For instance, char-set:letter contains about 100,000 characters.
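
As a small illustration of the point above, the following sketch queries one of the predefined sets with the SRFI-14 procedures char-set-size and char-set-contains?. The variable names letter-count and e-acute-is-letter? are made up for this example, and the exact count depends on the Unicode version your Guile was built against.

```scheme
;; Number of characters in char-set:letter; the exact value varies
;; with the Unicode data shipped with Guile.
(define letter-count (char-set-size char-set:letter))

;; Membership does not depend on the current locale.
;; #\x00e9 is LATIN SMALL LETTER E WITH ACUTE.
(define e-acute-is-letter?
  (char-set-contains? char-set:letter #\x00e9))

(display letter-count)
(newline)
(display e-acute-is-letter?)
(newline)
```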

Scheme Variable: char-set:lower-case
C Variable: scm_char_set_lower_case: All lower-case characters.

Scheme Variable: char-set:upper-case
C Variable: scm_char_set_upper_case: All upper-case characters.
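
Character sets can be passed wherever the string procedures accept a character predicate. The sketch below uses string-index and string-filter with the two case sets; first-upper and only-lower are names chosen for this example.

```scheme
;; Index of the first upper-case character, or #f if there is none.
(define first-upper
  (string-index "hello World" char-set:upper-case))   ; => 6

;; Keep only the lower-case characters of a string.
(define only-lower
  (string-filter char-set:lower-case "Hello World"))  ; => "elloorld"
```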

Scheme Variable: char-set:title-case
C Variable: scm_char_set_title_case: All single characters that function as if they were an upper-case letter followed by a lower-case letter.
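
For instance, U+01C5 (LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON) is a single character that looks like a capital D followed by a small z. A minimal check, with variable names invented for this example:

```scheme
;; U+01C5 is a title-case digraph.
(define dz-titlecase #\x01c5)

(define dz-in-title-case?
  (char-set-contains? char-set:title-case dz-titlecase))  ; => #t

;; An ordinary capital letter is upper case, not title case.
(define capital-a-in-title-case?
  (char-set-contains? char-set:title-case #\A))           ; => #f
```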

Scheme Variable: char-set:letter
C Variable: scm_char_set_letter: All letters. This includes char-set:lower-case, char-set:upper-case, char-set:title-case, and many letters that have no case at all. For example, Chinese and Japanese characters typically have no concept of case.
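
The following sketch checks a few representative characters against these sets. It uses U+4E2D, a common CJK ideograph, as the caseless example; the helper name in-set? is hypothetical.

```scheme
;; Hypothetical shorthand for a membership test.
(define (in-set? cs ch) (char-set-contains? cs ch))

(define zhong #\x4e2d)   ; CJK ideograph U+4E2D, a letter without case

;; Cased letters and the caseless ideograph are all letters.
(define all-letters?
  (and (in-set? char-set:letter #\a)
       (in-set? char-set:letter #\A)
       (in-set? char-set:letter #\x01c5)   ; a title-case letter
       (in-set? char-set:letter zhong)))   ; => #t

;; The ideograph belongs to none of the three case sets.
(define zhong-has-case?
  (or (in-set? char-set:lower-case zhong)
      (in-set? char-set:upper-case zhong)
      (in-set? char-set:title-case zhong)))  ; => #f
```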

Scheme Variable: char-set:digit
C Variable: scm_char_set_digit: All digits.
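
Because the set covers all of Unicode, it is not limited to 0 through 9. The sketch below checks a non-ASCII decimal digit and, as one possible approach, derives an ASCII-only digit set by intersection; ascii-digits is a name made up for this example.

```scheme
;; U+0661 is ARABIC-INDIC DIGIT ONE, a decimal digit outside ASCII.
(define arabic-indic-one-is-digit?
  (char-set-contains? char-set:digit #\x0661))

;; Restrict to the ten ASCII digits when only 0-9 should match.
(define ascii-digits
  (char-set-intersection char-set:digit char-set:ascii))

(define ascii-digit-count (char-set-size ascii-digits))  ; => 10
```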

Scheme Variable: char-set:letter+digit
C Variable: scm_char_set_letter_and_digit: The union of char-set:letter and char-set:digit.
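
A brief sketch of how this set might be used and how it relates to its two parts. The names below are illustrative only; same-as-union? simply compares the predefined set with a union built by hand, which should agree if the description above holds.

```scheme
;; Compare the predefined set with an explicitly built union.
(define same-as-union?
  (char-set= char-set:letter+digit
             (char-set-union char-set:letter char-set:digit)))

;; Typical use: test that a string is purely alphanumeric.
(define alnum-ok?  (string-every char-set:letter+digit "abc123"))   ; true
(define alnum-bad? (string-every char-set:letter+digit "abc-123"))  ; => #f
```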

Scheme Variable: char-set:graphic
C Variable: scm_char_set_graphic: All characters which would put ink on the paper.

Scheme Variable: char-set:printing
C Variable: scm_char_set_printing: The union of char-set:graphic and char-set:whitespace.
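
The difference between the graphic and printing sets shows up with the space character, which prints but leaves no ink. A minimal sketch, with variable names chosen for this example:

```scheme
;; Space is a printing character but not a graphic one.
(define space-prints?  (char-set-contains? char-set:printing #\space))  ; => #t
(define space-graphic? (char-set-contains? char-set:graphic #\space))   ; => #f

;; A control character such as U+0007 (bell) is neither.
(define bell-prints? (char-set-contains? char-set:printing #\x07))      ; => #f

;; The union described above, built by hand for comparison.
(define printing-from-parts
  (char-set-union char-set:graphic char-set:whitespace))
(define matches-description?
  (char-set= char-set:printing printing-from-parts))
```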

Scheme Variable: char-set:whitespace
C Variable: scm_char_set_whitespace: All whitespace characters.

Scheme Variable: char-set:blank
C Variable: scm_char_set_blank: All horizontal whitespace characters, which notably includes #\space and #\tab.
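
The following sketch contrasts this set with char-set:whitespace and shows one way to split a line on horizontal whitespace only. The variable names are invented for this example.

```scheme
;; Tab is horizontal whitespace; newline is whitespace but not blank.
(define tab-blank?          (char-set-contains? char-set:blank #\tab))           ; => #t
(define newline-blank?      (char-set-contains? char-set:blank #\newline))       ; => #f
(define newline-whitespace? (char-set-contains? char-set:whitespace #\newline))  ; => #t

;; Tokens are maximal runs of non-blank characters.
(define fields
  (string-tokenize "a b\tc" (char-set-complement char-set:blank)))
;; => ("a" "b" "c")
```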

Scheme Variable: char-set:iso-control
C Variable: scm_char_set_iso_control: The ISO control characters are the C0 control characters (U+0000 to U+001F), delete (U+007F), and the C1 control characters (U+0080 to U+009F).
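
The three ranges named above can be assembled by hand with ucs-range->char-set, whose upper bound is exclusive, and compared with the predefined set. This is only a sketch; manual-iso-control is a name made up for this example.

```scheme
;; C0 controls, delete, and C1 controls, built from code point ranges.
;; ucs-range->char-set covers the half-open range [lower, upper).
(define manual-iso-control
  (char-set-union (ucs-range->char-set #x00 #x20)    ; U+0000 to U+001F
                  (char-set #\x7f)                   ; delete
                  (ucs-range->char-set #x80 #xa0)))  ; U+0080 to U+009F

(define manual-size (char-set-size manual-iso-control))  ; => 65

;; Should agree with the predefined set if the description holds.
(define same-as-predefined?
  (char-set= char-set:iso-control manual-iso-control))
```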

Scheme Variable: char-set:punctuation
C Variable: scm_char_set_punctuation: All punctuation characters, such as the characters !"#%&'()*,-./:;?@[\]_{}

Scheme Variable: char-set:symbol
C Variable: scm_char_set_symbol: All symbol characters, such as the characters $+<=>^`|~.
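
The split between punctuation and symbol characters is not always intuitive, so a quick filter over a sample string can help. A small sketch, with names chosen for this example:

```scheme
(define sample "Hi, there! 1+1=2")

;; Comma and exclamation mark are punctuation.
(define puncts  (string-filter char-set:punctuation sample))  ; => ",!"

;; Plus and equals are symbols.
(define symbols (string-filter char-set:symbol sample))       ; => "+="
```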

Scheme Variable: char-set:hex-digit
C Variable: scm_char_set_hex_digit: The hexadecimal digits 0123456789abcdefABCDEF.
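
One possible use is validating input before parsing it as hexadecimal. The procedure hex-string? below is a hypothetical helper, not part of Guile.

```scheme
;; Hypothetical helper: true if S is non-empty and consists only of
;; hexadecimal digits.
(define (hex-string? s)
  (and (not (string-null? s))
       (string-every char-set:hex-digit s)
       #t))

(define good (hex-string? "DEADbeef"))  ; => #t
(define bad  (hex-string? "0x1F"))      ; => #f, because of the x
```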

Scheme Variable: char-set:ascii
C Variable: scm_char_set_ascii: All ASCII characters.

Scheme Variable: char-set:empty
C Variable: scm_char_set_empty: The empty character set.

Scheme Variable: char-set:designated
C Variable: scm_char_set_designated: All designated code points, that is, every code point to which Unicode has assigned a character or other meaning.

Scheme Variable: char-set:full
C Variable: scm_char_set_full: This character set contains all possible code points. This includes both designated and reserved code points.
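
The last few sets nest inside one another, which the sketch below checks with char-set<=. The variable names are illustrative, and the example assumes that U+0378 is still an unassigned (reserved) code point in the Unicode data your Guile uses.

```scheme
(define empty-size (char-set-size char-set:empty))  ; => 0
(define ascii-size (char-set-size char-set:ascii))  ; => 128

;; Each set is a subset of the next one.
(define nested?
  (char-set<= char-set:empty
              char-set:ascii
              char-set:designated
              char-set:full))

;; Assumption: U+0378 is reserved, so it should be in char-set:full
;; but not in char-set:designated.
(define reserved-in-full?
  (char-set-contains? char-set:full #\x0378))
(define reserved-in-designated?
  (char-set-contains? char-set:designated #\x0378))
```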