String operations

newstring length [char]
Creates a string of the given length. If char is supplied, the string is filled with that character.
>(newstring 5 #\a)
"aaaaa"
whitec c
Predicate to test if a character is whitespace (space, newline, tab, or return).
>(whitec #\tab)
t
>(whitec "  ")
nil
nonwhite c
Predicate to test if a character is not whitespace (space, newline, tab, or return).
>(nonwhite #\tab)
nil
>(nonwhite #\a)
t
letter c
Predicate to test if a character is alphabetic. New in arc3.
>(letter #\A)
t
>(letter #\2)
nil
alphadig c
Predicate to test if a character is alphabetic or a digit.
>(alphadig #\A)
t
>(alphadig #\2)
t
punc c
Predicate to determine if c is a punctuation character in the set: .,;:!?
>(punc #\.)
t
>(punc #\a)
nil
>(punc ".")
nil
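Because these predicates take a single character, they can be passed to the general sequence functions. The sketch below assumes that keep, rem, all and find (defined elsewhere in arc.arc, not in this section) accept a string as their sequence argument; the results in the comments are what those definitions should produce.

```arc
; Keep only the letters of a string.
(keep letter "a1b2c3")          ; => "abc"

; Drop every whitespace character.
(rem whitec "a b\tc")           ; => "abc"

; Check that every character is a letter or a digit.
(all alphadig "abc123")         ; => t

; Find the first punctuation character, if any.
(find punc "Hello, world!")     ; => #\,
```

Passing a string where a character is expected, as in (punc "."), simply yields nil rather than an error.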
downcase str
Converts a string, character, or symbol to lower case. This only converts ASCII; Unicode is unchanged.
>(downcase "abcDEF123")
"abcdef123"
>(downcase #\A)
#\a
>(downcase 'abcDEF123)
abcdef123
upcase str
Converts a string, character, or symbol to upper case. This only converts ASCII; Unicode is unchanged.
>(upcase "abcDEF123")
"ABCDEF123"
>(upcase #\a)
#\A
>(upcase 'abcDEF123)
ABCDEF123
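A common use of downcase is case-insensitive comparison. The following is a minimal sketch; same-word is a hypothetical helper name, and the comparison only ignores case for ASCII letters, as noted above.

```arc
; same-word is a hypothetical helper, not part of Arc.
; It compares two strings after folding both to lower case.
(def same-word (a b)
  (is (downcase a) (downcase b)))

(same-word "Hello" "HELLO")   ; => t
(same-word "Hello" "Help")    ; => nil
```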
ellipsize str [limit]
If str is longer than the limit (default 80), truncates it to limit characters and appends an ellipsis ('...'). The result can therefore be up to three characters longer than limit.
>(ellipsize "Too long" 6)
"Too lo..."
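The boundary cases may be worth seeing side by side. This sketch relies only on the behavior described above: strings at or under the limit come back unchanged, and the three dots are added on top of the limit.

```arc
; Shorter than the limit: returned unchanged.
(ellipsize "short" 10)            ; => "short"

; Exactly at the limit (10 characters): also unchanged.
(ellipsize "exactly10!" 10)       ; => "exactly10!"

; Longer than the limit: 6 characters kept plus "...".
(len (ellipsize "Too long" 6))    ; => 9
```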
rand-string n
Generates a random string of alphanumerics of length n.
>(rand-string 10)
"FPEMMZuVrt"
string arg [...]
Converts the args into a string. The args must be coerce-able to a string.
>(string 2 'a '(#\b #\c))
"2abc"
recstring f string [start]
Recursively steps through the string until f returns a non-nil value, and returns that value. Returns nil otherwise. The values passed to f are integer indices; the indices start at 0, or start if specified.
>(let str "abcde"
  (recstring
    (fn (idx) (if (is (str idx) #\c) (+ 10 idx)))
    str))
12
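The optional start argument can be illustrated with a small search helper. This is a sketch only; first-white is a hypothetical name introduced here. Note that an index of 0 counts as a non-nil result, since only nil is false in Arc.

```arc
; first-white is a hypothetical helper, not part of Arc.
; It returns the index of the first whitespace character
; at or after start, or nil if there is none.
(def first-white (str (o start 0))
  (recstring (fn (i) (if (whitec (str i)) i))
             str
             start))

(first-white "ab cd ef")      ; => 2
(first-white "ab cd ef" 3)    ; => 5
(first-white "abc")           ; => nil
```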
w/bars [statement ...]
Executes each statement and collects the output. A vertical bar is placed between the outputs of those statements that produce output; statements that print nothing are skipped.
>(w/bars (pr "a") 42 (pr "b") (pr "c") nil)
a | b | c
nil
bar*
The separator string used by w/bars.
>bar*
" | "
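Since w/bars prints rather than returns its result, capturing the output is one way to get the joined text as a value. The sketch below assumes the tostring macro from arc.arc, which is not described in this section.

```arc
; Capture the printed output of w/bars as a string.
(tostring (w/bars (pr "home") (pr "about") (pr "contact")))
; => "home | about | contact"
```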

strings.arc library

tokens str [sep]
Splits str into tokens based on the separator (default whitec). sep is either a predicate function on a character, or a character.
>(tokens " cat dog bird  lizard\nrat\tmouse")
("cat" "dog" "bird" "lizard" "rat" "mouse")
>(tokens "foo.bar..baz" #\.)
("foo" "bar" "baz")
>(tokens "a!bc%de@f" (fn (c) (pos c "!@%")))
("a" "bc" "de" "f")
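As the "foo.bar..baz" example shows, adjacent separators do not produce empty tokens. A few more uses, sketched with map and len from arc.arc:

```arc
; Empty fields between separators are dropped.
(tokens "a,b,,c" #\,)                    ; => ("a" "b" "c")

; Tokens are ordinary strings, so they can be processed with map.
(map upcase (tokens "cat dog bird"))     ; => ("CAT" "DOG" "BIRD")

; Count the words in a string.
(len (tokens "the quick brown fox"))     ; => 4
```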
urldecode str
Decodes a string in URL encoding. The string is assumed to be UTF-8.
>(urldecode "abc+def %c2%a9")
"abc def ©"
litmatch pat seq [start]
Tests if seq starting at offset start begins with pat. Because of the macro expansion, pat must be a literal string or list of characters and not a variable. seq can be a string or list of characters.
>(litmatch "abc" "abcde")
t
>(litmatch "abc" "xabcde")
nil
>(litmatch "abc" "xabcde" 1)
t
>(litmatch (#\a #\b #\c) "abcde")
t
endmatch pat seq
Tests if seq ends with pat. Because of the macro expansion, pat must be a literal string or list of characters and not a variable. seq can be a string or list of characters.
>(endmatch "abc" "abcde")
nil
>(endmatch "cde" "abcde")
t
>(endmatch (#\c #\d #\e) "abcde")
t
posmatch pat seq [start]
If pat is a string or list of characters, returns the index of the first occurrence of pat in seq at or after start. If pat is a predicate function on one character, it is applied to the characters of seq (starting from start), and the index of the first character for which it returns true is returned. Returns nil if there is no match. seq is a string or list of characters.
>(posmatch "abc" "junk")
nil
>(posmatch "a" "banana" 2)
3
>(posmatch (fn (c) (in c #\a #\b)) "foobar")
3
>(posmatch '(#\a #\b) '(#\c #\a #\b))
1
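The index returned by posmatch is an absolute position in seq, so it can be fed directly to cut. The following sketch splits a string around the first "="; cut comes from arc.arc and is not described in this section.

```arc
; Split "key=value" around the first "=".
(let s "key=value"
  (let i (posmatch "=" s)
    (if i
        (list (cut s 0 i) (cut s (+ i 1)))
        (list s))))
; => ("key" "value")
```

When there is no "=", posmatch returns nil and the sketch falls back to a one-element list.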
findsubseq pat seq [start]
Finds the index at which pat occurs as a contiguous run within seq, searching from start. Both pat and seq can be strings or lists of characters. findsubseq is similar to posmatch, except it doesn't accept a function for pat.
>(findsubseq "abc" "fooabcbar")
3
>(findsubseq "abc" "x")
nil
>(findsubseq "an" "banana" 2)
3
headmatch pat seq [start]
Tests if seq from offset start onwards starts with pat. pat and seq can be strings or lists of characters. headmatch fails with an error if pat runs past the end of seq while every character up to the end of seq matches.
>(headmatch "abc" "abcde")
t
>(headmatch "cd" "abcde")
nil
>(headmatch "cd" "abcde" 2)
t
>(headmatch '(#\a #\b) '(#\a #\b #\c))
t
begins seq pat [start]
Tests if seq begins with pat, optionally starting the comparison at offset start. begins is the same as headmatch with the first two arguments reversed, except that begins returns nil instead of failing when the match would run past the end of seq.
>(begins "abcde" "abc")
t
>(begins "abc" "abcde")
nil
>(begins "abcde" "cd" 2)
t
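Unlike litmatch, begins is an ordinary function, so the pattern can be held in a variable. A brief sketch, using keep and the bracket function syntax from core Arc:

```arc
; Select the words that start with a prefix held in a variable.
(let prefix "un"
  (keep [begins _ prefix] '("undo" "redo" "unzip")))
; => ("undo" "unzip")

; A pattern longer than the sequence yields nil, not an error.
(begins "ftp" "http://")    ; => nil
```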
subst new old seq
Substitutes new for old in seq. new can be any printable object. old and seq can be strings or lists of characters.
>(subst "bar" "foo" "catfood dogfood")
"catbard dogbard"
>(subst '(1 2) "a" "banana")
"b(1 2)n(1 2)n(1 2)"
multisubst pairs seq
Performs multiple substitutions on seq. pairs is a list of pairs of old and new values.
>(multisubst '(("a" 1) ("b" "B")) "banana")
"B1n1n1"
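One reason to prefer multisubst over nested subst calls is that multisubst appears to scan seq once, so text it has just written is not substituted again. The sketch below assumes that single-pass behavior, which matches the arc3 source as far as I can tell; the escaping table is only an illustration.

```arc
; All pairs are applied during one left-to-right scan.
(multisubst '(("&" "&amp;") ("<" "&lt;") (">" "&gt;"))
            "1 < 2 & 3")
; => "1 &lt; 2 &amp; 3"

; Nested subst calls rescan earlier output, so the & written by
; the inner call is escaped a second time by the outer call.
(subst "&amp;" "&" (subst "&lt;" "<" "1 < 2"))
; => "1 &amp;lt; 2"
```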
blank str
Tests if str is blank, that is, contains nothing but whitespace. str can be a string or a list of characters.
>(blank "a b")
nil
>(blank " ")
t
>(blank '(#\space #\tab #\newline))
t
trim str where [test]
Trims whitespace (or the characters selected by test) from str. where can have the value 'front, to trim the front of the string (currently broken); 'end, to trim the end of the string; or 'both to trim both ends of the string. If specified, test is either a character or a predicate function on characters.
>(trim " abc " 'end)
" abc"
>(trim " abc " 'both)
"abc"
>(trim "aabcaa" 'both #\a)
"bc"
>(trim "aabcaa" 'both (fn (_) (in _ #\a #\b)))
"c"
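Since test can be any character predicate, the predicates described earlier can be passed directly. A short sketch:

```arc
; Strip trailing punctuation using the punc predicate.
(trim "Hello!!!" 'end punc)       ; => "Hello"

; Strip a specific character from both ends.
(trim "...wait..." 'both #\.)     ; => "wait"
```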
num m [digits [trail-zeros [init-zero]]]
Formats a real number as a string, inserting commas between groups of three digits in the integer part. digits is the number of digits after the decimal point, trail-zeros is a Boolean indicating if trailing zeros should be included, and init-zero is a Boolean indicating if there should be a 0 before the decimal point.
>(num 123456)
"123,456"
>(num -123456)
"-123,456"
>(num 1.2345 2)
"1.23"
>(num 1.2 4 t)
"1.2000"
>(num 0.3 4 t t)
"0.3000"
pluralize n str
Returns str pluralized; if n is 1 or a list of length 1, str is returned unchanged; otherwise an 's' is appended. Renamed from plural in arc3.
>(pluralize 2 "fox")
"foxs"
>(pluralize '() "fish")
"fishs"
plural n str
Returns n and str pluralized. New in arc3.
>(plural 2 "fox")
"2 foxs"
>(plural '() "fish")
" fishs"
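For nouns with a regular plural, plural is convenient when building messages. A sketch combining it with string:

```arc
(plural 1 "result")                           ; => "1 result"
(plural 3 "result")                           ; => "3 results"

; Zero is not 1, so the plural form is used.
(string "Found " (plural 0 "result") ".")     ; => "Found 0 results."
```

As the "foxs" and "fishs" examples show, irregular plurals are not handled.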
halve str
Splits a string in two on whitespace. New in arc3.
>(halve "ab cd ef")
("ab" " cd ef")
>(halve "abc")
("abc")
positions test seq
Returns a list of positions in seq where test is true. Works on sequences in general. New in arc3.
>(positions #\a "That abacus")
(2 5 7)
>(positions odd '(1 2 4 5 7))
(0 3 4)
lines str
Splits str into lines. New in arc3.
>(lines "a b\nc d\n\ne f")
("a b" "c d" "" "e f")
urlencode str
Completely URL-encodes str: every character is written as a percent escape, including letters and digits. Doesn't work with Unicode. New in arc3.
>(urlencode "abc")
"%61%62%63"
>(urlencode "☃")
"%2603"
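For ASCII input, urlencode and urldecode should round-trip. The sketch below sticks to ASCII because, as the snowman example shows, non-ASCII characters are not encoded as valid UTF-8 escapes.

```arc
; Every character becomes a two-digit hex escape.
(urlencode "a b")                 ; => "%61%20%62"

; Decoding the encoded form gives back the original string.
(urldecode (urlencode "a b"))     ; => "a b"
```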
nonblank str
Returns str if it is nonblank, nil otherwise. New in arc3.
>(nonblank "a b")
"a b"
>(nonblank "\n\t ")
nil
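Because nonblank returns its argument or nil, it can serve as a filter predicate. A sketch that combines it with lines, using keep and map from arc.arc:

```arc
; Drop the empty line produced by the doubled newline.
(keep nonblank (lines "a b\nc d\n\ne f"))
; => ("a b" "c d" "e f")

; Line lengths, including the empty line.
(map len (lines "a b\nc d\n\ne f"))
; => (3 3 0 3)
```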

Copyright 2008 Ken Shirriff.