Ken Shirriff's blog: July 2008

The rise of scripting languages and the fall of Java

Java is very much in full retreat.
-- R. Loui

Professor Ronald Loui has an interesting article on the rise of scripting languages (In Praise of Scripting: Real Programming Pragmatism) in the July 2008 issue of IEEE Computer. It claims scripting languages such as Perl, Python, and Javascript have dramatically fulfilled their early promise, provide many benefits, and are poised to take over the lead from Java. However, the academic programming language community is stuck in theory and hasn't recognized the ascendence of scripting languages.

I agree that scripting languages are on the rise. Most people would agree that they provide rapid development, higher levels of abstraction, and brevity that helps the programmer. The article also describes how scripting languages can be a performance win, since they can allow experimentation and implementation of efficient algorithms that would be too painful in Java or C++. So even if C++ is faster on the micro-benchmark level, a programmer using a scripting language may end up with faster algorithms overall. I've argued somewhat controversally that Arc is too slow for my programming problems, so I remain unconvinced that basic performance can be ignored entirely.

As for the claim that Java is in full retreat, it strikes me as wishful thinking. (I'd believe "slow decline" though.) It will be interesting to check back on this claim in 5 years.

I personally believe that CS1 [freshman computer science] Java is the greatest single mistake in the history of computing curricula.
-- R. Loui

The article suggests good languages for teaching introductory computer science are gawk, Javscript, PHP, and ASP, but Python is emerging as a consensus for the best freshman programming language. This is the hardest part of the article for me to swallow. The idea of writing real programs in Awk never occurred to me, and I remain skeptical even though the author claims it works well. For those who would suggest Scheme as an introductory programming language, it was displaced as a dominant freshman language by Java a decade ago, and is apparently no longer considered an option.

I can't argue with the author's claim that student learning is enhanced by experimenting, writing code, and getting hands-on experience, and that scripting languages make this faster and easier.

Python and Ruby have the enviable properties that almost no one dislikes them, and almost everyone respects them.
-- R. Loui

In Why your favorite language is unpopular I discussed how the Change Function model can explain the success of programming languages based on maximizing the crisis solved and minimizing the perceived pain of adoption. I can apply this model applies to scripting languages as well:

Magnitude of crisis solved by Tcl/Tk: High - How to add a scripting language to a C program. How to add a GUI to a C program without painful X11 and Motif code.
Total Perceived Pain of Adoption: Low - Link Tcl in with your C program and add a few hooks. Create the GUI with trivial scripts.

Magnitude of crisis solved by Perl: High - How to quickly write CGI scripts. How to solve problems too complex for shell scripts. How to process files. How to develop quickly and iteratively.
Total Perceived Pain of Adoption: Low - Apart from looking like line noise, Perl is easy to get started with, is well integrated with Unix, has the definitive regex implementation, and has libraries for almost everything.

My point is that these languages solved specific painful problems and had low pain of adoption. As a result, they were much more successful than beautiful, powerful languages that were less able to directly solve painful problems or were more painful to adopt.

The real reason why academics were blindsided by scripting is their lack of practicality.
-- R. Loui

A major thrust of the article is that academics are too concerned with theoretical issues of syntax and semantics, rather than pragmatic issues of what a language can achive quickly, inexpensively, and practically. Academics are said to be too tied to theoretical concepts such as object-oriented programming and strong typing, and are missing the real-world benefits of scripting languages.

(Interestingly, Rob Pike made a similar argument against academics in the context of operating systems software (Systems Software Research is Irrelevant), stating that academic research is irrelevant and the real innovation is in industry. Since I have friends doing academic OS research, I should add a disclaimer here that I don't necessarily agree.)

One measure of pragmatics raised by the paper is how well does a language work with other Unix tools. I think the importance of this is underappreciated. In particular, I view this as a significant barrier to adoption of Arc. Running Arc as a shell script instead of a REPL is nontrivial (as is the case with many Lisp and Scheme implementations). Running an external program from Arc is clunky, even though it is often necessary to actually get things done (Kens' law), and real pipes are missing from Arc entirely.

Java's integration with Unix also has painful gaps - where's getpid() for instance? Why is JNI so difficult compared to calling native code from C#? I blame Sun's pure-Java platform independence ideology, and I'm surprised it hasn't hurt Java more.

On the other hand, Python and Perl provide a remarkable degree of integration, which I view as a key factor in their success. Likewise, Visual Basic is highly integrated with the Windows environment and highly successful there.

In conclusion, Loui's paper raises numerous interesting points about the success of scripting languages. I expect that the reasons for the rise of scripting languages will only get stronger, and languages that don't support the scripting model will have an increasingly harder time gaining adoption.

Note: quotes above are from the preprint and may not match the published article.

Why your favorite language is unpopular

The total world's population of Haskell programmers fits in a 747. And if that goes down, nobody would even notice.
-- Erik Meijer

I recently saw an interesting talk on functional programming by Erik Meijer (of Bananas, Lenses, Envelopes, and Barbed Wire fame). Among other things, he discussed why many superior technologies such as Haskell don't catch on.

Geek formula for success

He claims the "Geek formula" for success of a technology is that if a technology is 10 times better, it should catch on and become popular. Even if it is slower, Moore's law will soon make it 10 times faster.

So if Haskell is 10 times better than C and Haskell programs are 10 times shorter, everybody should be using Haskell.

Real-life formula for success

However, as Erik points out, "That's not how it is in real life." In real life, success is based on the perceived crisis divided by the perceived pain of adoption. Users want something that will get the job done and solve their crisis, without a lot of pain to switch.

This argument applies to many languages that remain unpopular despite their technical merits, such as Lisp, Arc, and Erlang, as well as technologies such as the Semantic Web and LaTex.

The Change Function

The above argument is based on the book The Change Function: Why Some Technologies Take Off and Others Crash and Burn by Pip Coburn. To summarize the book, new technologies aren't adopted because they are great, new, and disruptive; they are adopted only if the user's crisis solved by the technology is greater than the perceived pain of adoption. As a result, most new technologies fail.

The first half of the book is a bit fluffy, but gets more interesting when it discusses specific technologies that failed or succeeded. The book also goes out on a limb and predicts future winners (mobile enterprise Email, satellite radio, business intelligence software) and losers (RFID, entertainment PC, WiMax).

Languages and The Change Function

The Change Function argument has a lot of merit for explaining what languages become popular and what languages don't. If Lisp is so great, why are there 8 million Visual Basic programmers worldwide and few Lisp programmers? The answer isn't pointy-haired bosses (since Lisp isn't popular on SourceForge either). The crisis vs. pain of adoption model provides a powerful explanation:

Magnitude of crisis solved by Visual Basic: High (e.g. how to easily write Windows applications)
Total Perceived Pain of Adoption for Visual Basic: Very Low (hit Alt-F11 in Excel and you're done)

Magnitude of crisis solved by Lisp: Low (metaprogramming, powerful macros, and higher-order functions are solutions in search of problems)
Total Perceived Pain of Adoption for Lisp: High (this shouldn't require explanation)

The same model explains the success of, for instance, Java:
Magnitude of crisis solved by Java: High (originally how to run code in a browser and write portable code, now how to avoid crashes due to memory allocation errors and bad pointers)
Total Perceived Pain of Adoption: Low (syntax similar to C++, easy to deploy)

Applying this model to other languages is left as an exercise for the reader.

Erik points out that Erlang and Haskell are now being marketed according to the second formula: there is a multicore crisis and functional languages are the solution. It will be interesting to see how much additional traction these languages get without addressing the "pain of adoption" part.

The Change Function and startups

The Change Function ends with ten sets of questions and a set of techniques for designing technologies that will be adopted; this part of the book has many ideas that would be beneficial for startups. Many of these are fairly obvious, such as "Fail fast and iterate", and have a customer-centered culture instead of a sales-centered culture, while others are more thought-provoking: "What is the user crisis you intend to solve? What are the top five reasons a user with this crisis would not use your product?" The ultimate conclusion of the book is "Figure out what people really want!", which brings to mind the advice to make something people want.

"Maxwell's equations of software" examined

A recent post quotes Alan Kay's statement that expressing Lisp in itself is the "Maxwell's Equations of Software":

Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!”

This quote appears many places on the web, but the code itself is harder to find. What is this amazing half page of code?

The Lisp 1.5 Manual, which was written by John McCarthy et al in 1961, is available at softwarepreservation.org. In it, the "Maxwell's equations" define a universal Lisp function evalquote that can evaluate any given function:

evalquote[fn;x] = apply[fn;x;NIL]

where

apply[fn;x;a] =
     [atom[fn] -> [eq[fn;CAR] -> caar[x];
                  eq[fn;CDR] -> cdar[x];
                  eq[fn;CONS] -> cons[car[x];cadr[x]];
                  eq[fn;ATOM] -> atom[car[x]];
                  eq[fn;EQ] -> eq[car[x];cadr[x]];
                  T -> apply[eval[fn;a];x;a]];
     eq[car[fn];LAMBDA] -> eval[caddr[fn];pairlis[cadr[fn];x;a]];
     eq[car[fn];LABEL] -> apply[caddr[fn];x;cons[cons[cadr[fn];
                               caddr[fn]];a]]]

eval[e;a] = [atom[e] -> cdr[assoc[e;a]];
     atom[car[e]] ->
      [eq[car[e],QUOTE] -> cadr[e];
      eq[car[e];COND] -> evcon[cdr[e];a];
      T -> apply[car[e];evlis[cdr[e];a];a]];
      T -> apply[car[e];evlis[cdr[e];a];a]]

evcon[c;a] = [eval[caar[c];a] -> eval[cadar[c];a];
      T -> evcon[cdr[c];a]]

evlis[m;a] = [null[m] -> NIL;
      T -> cons[eval[car[m];a];evlis[cdr[m];a]]]

The above code is defined in a meta-language (M-expressions), which can be straighforwardly translated into S-expressions. Functions in M-expressions use square brackets and have arguments separated by semicolons. M-expressions conditionals are of the form [predicate -> value; predicate -> value; ...]. M-expression label is analogous to defun or define.

The point of all this is that M-expressions are the code that operates on the S-expression data, but the M-expression meta-language and S-expression data actually coincide. Thus, code and data are the same thing in Lisp, and a half-page of code is sufficient to define a basic Lisp interpreter in Lisp given a few primitives (car, cdr, cons, eq, atom). The code presents a Meta-circular evaluator for Lisp; see (SICP chapter 4.1 for more details on metacircular evaluators. (Unfortunately, this won't give you a working Lisp interpreter for free; things such as the garbage collector, the list primitives, and parsing need to be implemented somewhere. Also note that this metacircular evaluator doesn't give you niceties such as arithmetic.))

To understand the above code, apply takes a function and argument, while eval acts on a form. The last argument to these is an association list for the environment, which stores the values of bound values and function names. In brief, apply implements CAR, CDR, CONS, ATOM, and EQ in terms of the primitives. It implements LAMBDA by pairing up the variables and arguments and passing them to eval. It implements LABEL (which defines a function) by adding the function name and definition to the association list.

The code for eval processes a form in a straightforward manner. It handles the QUOTE form by returning the quoted value. It handles COND by evaluating the predicates with the help of evcon. Otherwise, it interprets an atom as a variable and returns the value. If given a list, it interprets this as a function application; the arguments are evaluated with evalis and the function is evaluated by apply.

The above code is not quite complete; it relies on some other simple functions that were previously defined in the manual, such as equals and cadr Less obvious functions are pairlis[x;y;a] pairs up lists x and y and adds them to association list a. assoc[x;a] looks up x in association list a. sublis[a;y] treats association list a as a mapping of variables to values, and replaces variables in S-expression y with the associated variables. These functions can be straightforwardly built from the primitive functions.

(By the way, I'm pretty sure the comma in eq[car[e],QUOTE] is a typo, but that's how it is in the original.)