Thursday, April 24, 2008

Idioms for programming in Arc

The code that implements Arc illustrates many useful programming techniques. Some of the techniques occur multiple times, and can be considered idioms of Arc programming; techniques that can be applied to many problems. This article describes some of the most common idioms, and how they can be used in Arc programming. The techniques described focus largely on macros and recursion.

Macros

Not surprisingly, given Paul Graham's enthusiasm for macros, Arc makes heavy use of macros. A few idioms appear commonly in the code.

"Decorated body"

The most common use of macros in Arc is something I call the "decorated body", which wraps a chunk of body code with something to extend it, analogous to the decorator design pattern. Typically the decorator will set something up before executing the body code, as in w/outstring, w/table, and fromstring. (Note: the link for each operation goes to the documentation. Clicking on the code icon in the documentation will show the code for the operation.) The pattern is also used to set something up before execution and then clean up afterwards. This is especially useful for file operations, as in w/infile, w/outfile, and w/instring. These macros often have names starting with "w/", short for "with/", indicating what they are providing to the body. The HTML library makes heavy use of this idiom, for example in tag, form, and cdata, which wrap the body in various HTML opening and closing tags.

To examine the implementation of this idiom, consider let, which is defined as:

(mac let (var val . body)
  `(with (,var ,val) ,@body))
The typical features of this idiom are a macro definition with argument . body to pick up the remainder of the arguments, a macro definition starting with quasiquote `, and unquote-splicing ,@body to substitute the body code into the macro. The macro performs simple substitution:
arc> (macex1 '(let foo 42 body1 body2))
(with (foo 42) body1 body2)

Macro for control of evaluation

Unlike functions, macros receive their arguments unevaluated. This allows macros to be used for conditional execution of code, or to execute code multiple times iteratively. Some examples are when, unless, while, and for.

(You might expect if to be a macro. It is actually a built-in special form, since conditional macros need to be based on some underlying conditional form.)

The code for repeat illustrates that this idiom also uses the ,@body structure as "decorated body":

(mac repeat (n . body)
  `(for ,(uniq) 1 ,n ,@body))

Macro as lambda adapter

A variant of the body decorator is to use a macro to convert body code into a lambda function, which is similar to an adapter design pattern. This idiom is used when a function expects to receive a lambda function, but passing in body code is more convenient, especially when the base function is a Scheme primitive. Examples are thread, atomic, w/stdin, and w/stdout:
(mac w/stdout (str . body)
  `(call-w/stdout ,str (fn () ,@body)))
arc> (macex '(w/stdout foo body1 body2))
(call-w/stdout foo (fn nil body1 body2))

Macro to create functions with parameters

Another common macro idiom is to take a list of parameters and body code, and build a function out of these. This idiom is typically used to define a function that has special properties, and is often given a name that starts with "def". Some examples are rfn, afn, defmemo, defop, defopr, linkf, and rlinkf.

The code for defmemo shows the key component of this idiom: (fn ,parms ,@body) to build the function out of the parameters and body.

(mac defmemo (name parms . body)
  `(safeset ,name (memo (fn ,parms ,@body))))

Complex macros

One irony is that even though though macros are extremely powerful, most of the macros in Arc perform simple substitutions. There are, however, some macros that perform complex manipulation of their arguments. Assignment is built from the setform macro, which implements generalized variables through intricate macro processing of the arguments. Slightly less complex is and, which implements short-circuit evaluation by turning its arguments into nested if statements. Another interesting example is with; the macro splits up the arguments into variables and values:
arc> (macex '(with (a 1 b 2) (prn a b)))
((fn (a b) (prn a b)) 1 2)
The litmatch macro performs string matching by generating code to perform a sequence of character comparisons.
arc> (macex '(litmatch "abc" foo))
((fn (gs1442 gs1443) (unless (> (+ gs1443 3) (len gs1442))
  (and (is #\a (gs1442 (+ gs1443 0))) (is #\b (gs1442 (+ gs1443 1)))
       (is #\c (gs1442 (+ gs1443 2)))))) foo 0)

Recursion

Tail recursion is a standard Scheme technique for efficient recursion. If the last operation of a function is a recursive call, the compiler will use tail-call optimization to turn the recursion into iteration, avoiding creation of a stack frame on each call.

Iteration in Arc is built on top of tail recursion. Basic loop operations such as while, loop, and whilet use tail recursion on a function built from rfn:

(mac while (test . body)
  (w/uniq (gf gp)
    `((rfn ,gf (,gp)
        (when ,gp ,@body (,gf ,test)))
      ,test)))

cdr-scanning

A typical idiom for scanning a list is to tail-recursively call the function on the cdr of the list. Some examples are reclist, nthcdr, last, and rev.

The code for assoc illustrates this idiom. If there is no match, assoc is called on the cdr, until the whole list has been scanned:

(def assoc (key al)
  (if (atom al)
       nil
      (and (acons (car al)) (is (caar al) key))
       (car al)
      (assoc key (cdr al))))

Tail-recursive string functions

One idiom in Arc for using tail recursion with strings is to add an index parameter to the function. This index parameter indexes into the string, and by incrementing it in each call, the function can be made tail-recursive and doesn't require modifying the string. The index parameter is often made an optional parameter so "normal" uses of the function don't need to specify the index. Some examples of this idiom are recstring, findsubseq, indented-code, parabreak, and urlend.

The code for headmatch illustrates this idiom; in this case the tail-recursion uses an anaphoric function:

(def headmatch (pat seq (o start 0))
  (let p (len pat) 
    ((afn (i)      
       (or (is i p) 
           (and (is (pat i) (seq (+ i start)))
                (self (+ i 1)))))
     0)))

Recursion with cons

Many operations in Arc need to perform operations on a sequence and create a result list. The common Arc idiom is to recurse through a sequence, building up the result with cons. Specifically, a macro (op seq) will return the cons of something done to (car seq) and (op (cdr seq)). Thus, op recursively steps through seq, processing an element at a time and building up the result. This idiom provides an interesting contrast to the previous recursion examples because this idiom is not tail-recursive. Because the cons is applied to the result of the recursive function call, tail-call optimization cannot be applied. Several examples of this idiom are join, rem, reinsert-sorted, pair, firstn, firstn-that, tuples, and range.

The code for map1 illustrates this idiom:

(def map1 (f xs)
  (if (no xs) 
      nil
      (cons (f (car xs)) (map1 f (cdr xs)))))
Note that f is applied to (car xs), map1 is called recursively on (cdr xs), and the two are cons'd together.

Miscellaneous

The majority of the idioms I discuss are based on macros or recursion. This secion discusses some miscellaneous idioms that don't fit those categories.

Push and reverse

The "push and reverse" idiom is an efficient, but somewhat surprising way to generate a list. In this idiom, successive elements are pushed onto the front of the list, and when the entire list is generated, it is reversed to obtain the desired order. This is a well-known Lisp idiom and is more efficient than the obvious method of appending elements to the end of the list. Several examples of "push and reverse" are drain, dedup, parse-format, litmatch, endmatch, and handle-post.

The code for n-of illustrates the idiom. Items are added with push to an accumulator (ga in this case), and the accumulator is reversed with rev at the end:

(mac n-of (n expr)
  (w/uniq ga
    `(let ,ga nil     
       (repeat ,n (push ,expr ,ga))
       (rev ,ga))))

String stdout

Arc makes extensive use of an idiom I'll call "string stdout" because a string is used as stdout. The code writes multiple chunks to stdout, and the results are gathered into a string. This idiom avoids having to manually concatenate the components into the result string. The tostring macro is used to implement this idiom. Arc has a surprisingly large number of macros that use this idiom, including readline, code-block, unmarkdown, varfield, markdown, eschtml, striptags, parafy, subst, multisubst, and num.

Typically, code using this idiom will use produce output with pr or writec, and use tostring to collect the output. The code for multisubst illustrates this:

(def multisubst (pairs seq)
  (tostring 
    (forlen i seq
      (iflet (old new) (find [begins seq (car _) i] pairs)
        (do (++ i (- (len old) 1))
            (pr new))
        (pr (seq i))))))

The flip side of this idiom is the "string stdin" idiom, where a string is used as stdin. The "string stdin" idiom is rarely used; one example is parse-format.

Summary

The Arc code illustrates several common Arc idioms, some of which are well-known Scheme or Lisp idioms. Since Paul Graham is a Lisp expert, carefully studying his Arc code is a rewarding way to learn more about programming Lisp-like languages.

Undoubtedly I've missed some important idioms. Please let me know if your favorite idiom is missing, or if I've mangled the descriptions of any idioms.

Monday, April 14, 2008

Many Arc compilers and interpreters

The introduction of the Arc language has spawned a mini-boom of Arc compilers and interpreters implemented in everything from JavaScript to .NET. Because Arc is a fairly simple Lisp dialect, it is relatively straightforward to implement, and many people are using it to learn about compiler and interpreter design, and to showcase different implementation techniques and languages.

Paul Graham's official distribution started it all, of course. This version is implemented in MzScheme and translates Arc into Scheme code. The writeup on Arc Internals provides more information on how it is implemented.

The Arc Forum is the site of most of the unofficial Arc development effort. The "Anarki" branches of Arc on github consist of a "stable" bug-fixed version of official Arc, and a more exploratory version with many modifications. (For information on using github, see Brief Guide to Anarki and Git.

Anarki includes "ac.sbcl.lisp", an implementation of Arc in SBCL. It implements closures using CPS (Continuation-passing style) transformations, but doesn't include threads or networking according to the brief description.

Perhaps the first reimplementation of Arc was ArcLite, which is Arc ported to JavaScript. This version of Arc is unique because it runs in the browser. The running version can be accessed online. It provides a subset of functionality, omitting tail-call optimization, continuations, and various operations.

Rainbow (l'arc en ciel in French) is an implementation of Arc in Java. It is fairly complete, providing continuations and tail-call optimization according to the description. It is available at github.

An Arc compiler for .NET has been developed, based on the MBase compiler prototyping framework. The description states that it implements a subset of Arc, but is much faster than standard Arc. It is available from Meta Alternative.

Currently under active development is "arc2c", which compiles Arc to C. Current status is at arclanguage.org. This compiler is intended to be a full, high-performance implementation of Arc. It uses CPS transformations to implement continuations. It uses the Boehm garbage collector. arc2c is available at github.

CL-Arc is a proposal to compile Arc to Common Lisp; work hasn't started on it yet.

Many of these Arc implementations actively welcome participation. The future is likely to hold continuing improvement in the quality, functionality, and performance of Arc implementations, as well as surprising and unusual variants.

Thursday, April 10, 2008

Ajax and Arc

Ajax is surprisingly easy to use with the Arc language. This article shows how to implement Ajax-based autocompletion and dynamic page updates using the script.aculo.us and Prototype JavaScript frameworks. These frameworks allow Ajax applications to be implemented with just a few lines of additional code. The hard work is done by the script.aculo.us and Prototype libraries. All the Arc code needs to do is provide the list of autocompletion candidates, and the dynamic page content.

The example

This example implements a simple Ajax input autocompleter and dynamic page update. For the autocompleter, as soon as you enter a few letters into the input, it provides a list of matching terms. As soon as you select a term, the page is dynamically updated with associated data, as in the following screenshot:
screenshot
For the purposes of the demonstration, the content is information on the countries of the world, obtained from the CIA World Factbook.

To summarize how it all works: When you enter characters in the input field, the script.aculo.us autocompleter JavaScript sends the charaters to the Arc server, which returns the autocomplete suggestions. The autocompleter JavaScript displays the suggestions on the page. When you select a country, the updater JavaScript does three separate requests to the Arc server, to request the population, area, and capital. When the responses arrive, they are inserted into the page.

In more detail, the autocompletion is performed by Ajax.Autocompleter, and the dynamic updating is performed by Ajax.Updater. The control flow is:

  • The Ajax.Autocompleter associated with the autocomplete input and autocomplete_choices div sends inputs to /auto_complete?prefix=chars.
  • The Arc handler sends back a formatted list of autocomplete candidates, e.g. <ul><li>Iran</li><li>Iraq</li><li>Ireland</li></ul>.
  • The Ajax.Autocompleter displays the candidates in the autocomplete div.
  • When an autocompletion is selected, the update JavaScript function starts three Ajax.Updater instances, which send an Ajax request to URL /getcontents?field=population&name=value, and similarly for area and capital in parallel.
  • The Arc handler for getcontents returns the desired contents.
  • Ajax.Updater puts the contents into the contents div.

Running the example

First set up the necessary files.
  • Download ajax.zip and uncompress it into the same directory that Arc runs in. This zip file provides ajax.html, ajax.arc, data.arc, and autocomplete.css.
  • Download the script.aculo.us library files from script.aculo.us. Copy the .js files from lib and src into the same directory that Arc runs in.
Next, start up the Arc interpreter, load the ajax.arc file, and start the web server.
arc> (load "ajax.arc")
arc> (thread (serve 8080))
arc> ready to serve port 8080
Then go to http://localhost:8080/ajax.html. (Unfortunately I don't have a live demo on a public machine. If anyone sets one up, let me know and I'll add a link here.)

Start typing in the box, and you should see a dropdown with autocompleted choices. Click one, and the code will be displayed below asynchronously.

The Arc code

The Arc server code is in ajax.arc. Two Arc handlers are implemented to provide the client-side Ajax support. The first, auto_complete receives the current input contents and returns a list of up to 10 autocompletion candidates formatted in HTML. The second handler, getcontents returns the dynamic content for the page, from the database table. Note that the handlers do nothing particularly special to make them "Ajax"; they are standard Arc web handlers based on defop and can be accessed directly through the browser. See Arc Web Server for more information on web serving with Arc.
(defop auto_complete req
  (let prefix (downcase (arg req "prefix"))
    (prn (to-html-list (cut (startselts prefix keylist) 0 10)))))

(defop getcontents req 
  (with (field (arg req "field")  name (arg req "name"))
    (if (is field "population") (prn ((database name) 1))
        (is field "area") (prn ((database name) 2))
 (is field "capital") (prn ((database name) 3)))))
The handlers use a couple helpers; startselts returns the elements that match the autocomplete prefix and to-html-list wraps the elements in HTML list tags.
; Returns a list of elements that start with prefix
(def startselts (prefix seq) (rem [no (begins (downcase _) prefix)] seq))

; Wraps elts in HTML list tags
(def to-html-list (elts) (tostring
  (prall elts "<ul><li>" "</li><li>")
  (pr "</li></ul>")))
The actual content is obtained from data.arc, which contains the information on each country as a list of lists.
("United States" "301,139,947" "9,826,630" "Washington, DC")
Some simple code converts the list into a table called database indexed by country for easy lookup, and generates a sorted list of countries for use in autocompletion:
(= database (table))
(w/infile inf "data.arc"
  (let datalist (sread inf nil)
    (each elt datalist
      (= (database (elt 0)) elt))))
(= keylist (mergesort < (keys database)))

For some reason, the default Arc web server srv.arc doesn't support .js files. The ajax.arc file modifies the server slightly to support these files:

(= (srv-header* 'text/javascript) 
"HTTP/1.0 200 OK
Content-Type: text/javascript; charset=UTF-8
Connection: close")

(def static-filetype (sym)
  (let fname (string sym)
    (and (~find #\/ fname)
         (case (last (check (tokens fname #\.) ~single))
           "gif"  'gif
           "jpg"  'jpg
           "css"  'text/css
           "txt"  'text/html
           "html" 'text/html
           "js"   'text/javascript
           ))))

Debugging

Several things can go wrong when trying to run the example. If the initial page doesn't load at all, something is probably wrong with the Arc server. If no autocompletion happens, the JavaScript may not be loading, or the Arc server may have problems. If the autocompletion shows up as a bulleted list, the CSS file is probably not loading.

The Arc code can be debugged in several ways. The first is to access the handlers directly. The autocomplete handler http://localhost:8080/auto_complete?prefix=a should return a bullet list of autocomplete suggestions. The contents handler: http://localhost:8080/getcontents?field=area&name=Algeria should return the dynamic contents value.

The Arc code can also be debugged by strategically inserting print statements such as:

(write req (stderr))
This will display the request on the Arc console.

To debug Ajax, you can use Firefox plugins Live HTTP Headers and Firebug. Live HTTP Headers lets you see the headers between the browser and the server, and is very helpful to determine if something is failing. Firebug allows much more in-depth JavaScript debugging. The browser's JavaScript console is also useful to see JavaScript errors.

This article has just scratched the surface of script.aculo.us and Ajax. More information is available at the script.aculo.us website or in books.

Friday, April 4, 2008

Documentation of HTML generation in the Arc language

I've created documentation of Arc's HTML generation operations (from arc.arc). There are a lot of different functions there, but I've tried to cover them all.