The procedures in this section are similar to the character ordering predicates (see Characters), but are defined on character sequences.
The first set is specified in R5RS and has names that end in ?. The second set is specified in SRFI-13 and has names that do not end in ?.
The predicates ending in -ci ignore character case when comparing strings. For now, case-insensitive comparison is done using the R5RS rules, where every lower-case character that has a single-character upper-case form is converted to uppercase before comparison. See the (ice-9 i18n) module for locale-dependent string comparison.
Scheme Procedure: string=? s1 s2 s3 …
Lexicographic equality predicate; return #t if all strings are the same length and contain the same characters in the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.
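As a brief illustration of the difference between the two equality predicates, the following sketch assumes a Guile version (2.0 or later) in which they accept more than two arguments; the sample strings and variable names are arbitrary.

```scheme
;; string=? distinguishes case; string-ci=? does not.
(define exact-match    (string=? "Guile" "Guile" "Guile"))    ; all identical
(define case-differs   (string=? "Guile" "GUILE"))            ; same letters, different case
(define ci-match       (string-ci=? "Guile" "GUILE" "guile")) ; equal when case is ignored
(define length-differs (string-ci=? "Guile" "Guile!"))        ; lengths differ

(display (list exact-match case-differs ci-match length-differs))
(newline)
;; prints (#t #f #t #f)
```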
Scheme Procedure: string<? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically less than str_(i+1).
Scheme Procedure: string<=? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically less than or equal to str_(i+1).
Scheme Procedure: string>? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically greater than str_(i+1).
Scheme Procedure: string>=? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically greater than or equal to str_(i+1).
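The pairwise rule used by the R5RS ordering predicates string<?, string<=?, string>? and string>=? can be seen in a small sketch. It assumes a Guile version in which these predicates accept more than two arguments; the strings and variable names are arbitrary.

```scheme
;; Each predicate must hold for every pair of consecutive arguments.
(define ascending      (string<? "apple" "banana" "cherry"))  ; strictly increasing
(define with-repeat    (string<? "apple" "apple" "banana"))   ; fails on the first pair
(define non-decreasing (string<=? "apple" "apple" "banana"))  ; repeats are allowed
(define descending     (string>? "cherry" "banana" "apple"))  ; strictly decreasing
(define prefix-first   (string<? "app" "apple"))              ; a proper prefix sorts first

(display (list ascending with-repeat non-decreasing descending prefix-first))
(newline)
;; prints (#t #f #t #t #t)

;; The two-argument form is also usable as a sort comparator.
(define sorted (sort (list "pear" "apple" "fig") string<?))
;; sorted is ("apple" "fig" "pear")
```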
Scheme Procedure: string-ci=? s1 s2 s3 …
Case-insensitive string equality predicate; return #t if all strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.
Scheme Procedure: string-ci<? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically less than str_(i+1) regardless of case.
Scheme Procedure: string-ci<=? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically less than or equal to str_(i+1) regardless of case.
Scheme Procedure: string-ci>? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically greater than str_(i+1) regardless of case.
Scheme Procedure: string-ci>=? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_(i+1), str_i is lexicographically greater than or equal to str_(i+1) regardless of case.
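The effect of ignoring case on ordering is easiest to see with strings whose first letters differ in case. The following is a small sketch with arbitrary sample strings; it relies only on the fact that upper-case ASCII letters have smaller code points than lower-case ones.

```scheme
;; Case-sensitive: #\Z (code point 90) precedes #\a (code point 97).
(define sensitive     (string<? "Zebra" "apple"))
;; Case-insensitive: "apple" precedes "Zebra", so this ordering no longer holds.
(define insensitive   (string-ci<? "Zebra" "apple"))
(define ci-ascending  (string-ci<? "apple" "Zebra"))
;; Strings that differ only in case count as equal for the -ci predicates.
(define ci-with-equal (string-ci<=? "apple" "APPLE" "Banana"))

(display (list sensitive insensitive ci-ascending ci-with-equal))
(newline)
;; prints (#t #f #t #t)
```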
Scheme Procedure: string-compare s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
Apply proc_lt, proc_eq, or proc_gt to the mismatch index, depending on whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j]; that is, i is the first position that does not match.
Scheme Procedure: string-compare-ci s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
Apply proc_lt, proc_eq, or proc_gt to the mismatch index, depending on whether s1 is less than, equal to, or greater than s2 when case is ignored. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] and s2[j] are equal ignoring case; that is, i is the first position where the lowercased characters do not match.
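A minimal sketch of the three-way comparison, assuming the SRFI-13 procedures string-compare and string-compare-ci are available by default, as they are in Guile. The helpers classify and classify-ci are hypothetical names introduced only for this illustration; each continuation tags the mismatch index it receives.

```scheme
;; Hypothetical helper: report which continuation was chosen,
;; together with the mismatch index passed to it.
(define (classify s1 s2)
  (string-compare s1 s2
                  (lambda (i) (list 'less i))
                  (lambda (i) (list 'equal i))
                  (lambda (i) (list 'greater i))))

;; Same idea, using the case-insensitive variant.
(define (classify-ci s1 s2)
  (string-compare-ci s1 s2
                     (lambda (i) (list 'less i))
                     (lambda (i) (list 'equal i))
                     (lambda (i) (list 'greater i))))

(define r-less    (classify "abcd" "abxd"))   ; first difference at index 2
(define r-greater (classify "b" "abc"))       ; first difference at index 0
(define r-equal   (classify "abc" "abc"))     ; proc_eq is chosen
(define r-ci      (classify-ci "ABC" "abc"))  ; equal once case is ignored

(display (list r-less r-greater (car r-equal) (car r-ci)))
(newline)
;; prints ((less 2) (greater 0) equal equal)
```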
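Scheme Procedure: string= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 and s2 are not equal, a true value otherwise.
Scheme Procedure: string<> s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 and s2 are equal, a true value otherwise.
Scheme Procedure: string< s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is greater than or equal to s2, a true value otherwise.
Scheme Procedure: string> s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is less than or equal to s2, a true value otherwise.
Scheme Procedure: string<= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is greater than s2, a true value otherwise.
Scheme Procedure: string>= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is less than s2, a true value otherwise.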
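Because the SRFI-13 predicates string=, string<>, string<, string>, string<= and string>= are documented as returning "a true value" rather than #t, it is safest to use their results only as conditions. The sketch below normalizes each result with a hypothetical helper, ->bool, so that nothing depends on which true value is returned.

```scheme
;; Hypothetical helper: collapse any true value to #t.
(define (->bool x) (if x #t #f))

(define results
  (list (->bool (string=  "abc" "abc"))    ; equal
        (->bool (string<> "abc" "abd"))    ; not equal
        (->bool (string<  "abc" "abd"))    ; less than
        (->bool (string<  "abd" "abc"))    ; not less than
        (->bool (string>= "abc" "abc"))))  ; greater than or equal

(display results)
(newline)
;; prints (#t #t #t #f #t)
```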
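Scheme Procedure: string-ci= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.
Scheme Procedure: string-ci<> s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.
Scheme Procedure: string-ci< s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Scheme Procedure: string-ci> s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Scheme Procedure: string-ci<= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.
Scheme Procedure: string-ci>= s1 s2 [start1 [end1 [start2 [end2]]]]
Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.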
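The case-insensitive SRFI-13 predicates (string-ci=, string-ci<>, string-ci< and so on) behave the same way, with case ignored. A short sketch with arbitrary strings; ->bool is a hypothetical helper that collapses any true value to #t.

```scheme
;; Hypothetical helper: collapse any true value to #t.
(define (->bool x) (if x #t #f))

(define ci-results
  (list (->bool (string-ci=  "Hello" "hELLO"))    ; equal when case is ignored
        (->bool (string-ci<> "Hello" "HELLO"))    ; so "not equal" is false
        (->bool (string-ci<  "apple" "BANANA"))   ; a precedes b regardless of case
        (->bool (string-ci>  "apple" "BANANA"))))

(display ci-results)
(newline)
;; prints (#t #f #t #f)
```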
Scheme Procedure: string-hash s [bound [start [end]]]
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Scheme Procedure: string-hash-ci s [bound [start [end]]]
Compute a hash value for s, ignoring character case. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
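A small sketch of the bound argument, assuming the two hash procedures here are SRFI-13's string-hash and string-hash-ci. The concrete hash values are implementation details, so the example checks only properties that follow from the description: the result lies in [0,bound), and equal strings hash alike. That string-hash-ci gives the same value for strings differing only in case is an assumption based on its SRFI-13 definition.

```scheme
(define bound 16)   ; arbitrary bucket count for illustration
(define h (string-hash "hello" bound))

;; The result is an exact integer within [0, bound).
(define in-range
  (and (integer? h) (exact? h) (<= 0 h) (< h bound)))

;; Equal strings produce equal hash values.
(define same-hash
  (= h (string-hash (string-copy "hello") bound)))

;; Assumed from SRFI-13: case differences do not affect string-hash-ci.
(define same-hash-ci
  (= (string-hash-ci "Hello" bound) (string-hash-ci "hELLO" bound)))

(display (list in-range same-hash same-hash-ci))
(newline)
;; prints (#t #t #t)
```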
Because the same visual appearance of an abstract Unicode character can
be obtained via multiple sequences of Unicode characters, even the
case-insensitive string comparison functions described above may return
#f
when presented with strings containing different
representations of the same character. For example, the Unicode
character “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” can be
represented with a single character (U+1E69) or by the character “LATIN
SMALL LETTER S” (U+0073) followed by the combining marks “COMBINING
DOT BELOW” (U+0323) and “COMBINING DOT ABOVE” (U+0307).
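The example in the preceding paragraph can be reproduced directly. This sketch builds both representations from code points with integer->char; the variable names are illustrative only.

```scheme
;; U+1E69 as a single precomposed character.
(define precomposed (string (integer->char #x1E69)))

;; The same abstract character as "s" plus two combining marks.
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

(define lengths         (list (string-length precomposed)
                              (string-length decomposed)))
(define equal-exact?    (string=? precomposed decomposed))
(define equal-caseless? (string-ci=? precomposed decomposed))

(display (list lengths equal-exact? equal-caseless?))
(newline)
;; prints ((1 3) #f #f)
```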
For this reason, it is often desirable to ensure that the strings to be compared are using a mutually consistent representation for every character. The Unicode standard defines two methods of normalizing the contents of strings: Decomposition, which breaks composite characters into a set of constituent characters with an ordering defined by the Unicode Standard; and composition, which performs the converse.
There are two decomposition operations. “Canonical decomposition” produces character sequences that share the same visual appearance as the original characters, while “compatibility decomposition” produces ones whose visual appearances may differ from the originals but which represent the same abstract character.
These operations are encapsulated in the following set of normalization forms:
NFD: Characters are decomposed to their canonical forms.
NFKD: Characters are decomposed to their compatibility forms.
NFC: Characters are decomposed to their canonical forms, then composed.
NFKC: Characters are decomposed to their compatibility forms, then composed.
The functions below put their arguments into one of the forms described above.
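Scheme Procedure: string-normalize-nfd s
Return the NFD normalized form of s.
Scheme Procedure: string-normalize-nfkd s
Return the NFKD normalized form of s.
Scheme Procedure: string-normalize-nfc s
Return the NFC normalized form of s.
Scheme Procedure: string-normalize-nfkc s
Return the NFKC normalized form of s.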
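As a sketch of how normalization makes such strings comparable, the following uses Guile's string-normalize-nfd, string-normalize-nfc and string-normalize-nfkd on the U+1E69 example from earlier in this section. The ligature U+FB01 (LATIN SMALL LIGATURE FI) is an additional example chosen here to contrast canonical with compatibility decomposition; all variable names are illustrative.

```scheme
(define precomposed (string (integer->char #x1E69)))
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

;; After normalizing both strings to the same form, they compare equal.
(define nfc-equal?
  (string=? (string-normalize-nfc precomposed)
            (string-normalize-nfc decomposed)))
(define nfd-equal?
  (string=? (string-normalize-nfd precomposed)
            (string-normalize-nfd decomposed)))

;; NFD expands to three characters; NFC composes back to one.
(define nfd-length (string-length (string-normalize-nfd precomposed)))
(define nfc-length (string-length (string-normalize-nfc decomposed)))

;; Canonical decomposition leaves the "fi" ligature alone,
;; while compatibility decomposition splits it into "f" and "i".
(define ligature (string (integer->char #xFB01)))
(define nfd-keeps-ligature?
  (string=? (string-normalize-nfd ligature) ligature))
(define nfkd-splits-ligature?
  (string=? (string-normalize-nfkd ligature) "fi"))

(display (list nfc-equal? nfd-equal? nfd-length nfc-length
               nfd-keeps-ligature? nfkd-splits-ligature?))
(newline)
;; prints (#t #t 3 1 #t #t)
```

Which form to choose depends on the application; the important point is that both strings go through the same normalization before being compared.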