François-René Rideau (fare) wrote,

(Lots of ((Irritating, Spurious) (Parentheses)))

Derisive comments are often made about the syntax of Lisp, as witnessed by some reproaches on my previous blog entry. Hence the half-joking, half-serious backronym of Lots of (Insipid | Irritating | Infuriating | Idiotic | ...) and (Spurious | Stubborn | Superfluous | Silly | ...) Parentheses, and the accusations that Lisp syntax makes code incomprehensible to read and error-prone to write. I take exception to this general kind of comment, and I will argue in defense of the Lisp syntax.

Firstly, the nested parenthesized syntax is actually not bad at all, as far as understanding goes. It is actually so simple that anyone can grasp it and master it in 20 minutes. Parentheses might be tricky to balance by hand, but since the 1970s at least, interactive development environments match parentheses visually for you, so you never have to worry about counting them. Moreover, properly indenting code, which may be done automatically by editors, makes code quite readable. Using Lisp with improper user interfaces meant for other languages may be tedious, but this is hardly a technical claim against Lisp.
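
To show how little machinery parenthesis matching takes, here is a minimal sketch in Python (used only for illustration). The function name `matching_paren` is hypothetical and is not taken from any actual editor. The sketch walks from a parenthesis until the nesting depth returns to zero, and it deliberately ignores strings and comments, which a real editor has to handle.

```python
def matching_paren(text, index):
    """Return the index of the parenthesis matching the one at `index`,
    or None if the text is unbalanced. Strings and comments are ignored."""
    if text[index] == "(":
        step, opener, closer = 1, "(", ")"    # scan forward
    elif text[index] == ")":
        step, opener, closer = -1, ")", "("   # scan backward
    else:
        raise ValueError("no parenthesis at index")
    depth = 0
    pos = index
    while 0 <= pos < len(text):
        if text[pos] == opener:
            depth += 1
        elif text[pos] == closer:
            depth -= 1
            if depth == 0:
                return pos
        pos += step
    return None


code = "(defun f (x) (* x x))"
print(matching_paren(code, 0))    # 20: the outermost pair
print(matching_paren(code, 19))   # 13: scanning backward from a closer
```

An editor would typically run something like this on every cursor movement and highlight the returned position, which is why counting parentheses by hand is not needed.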

By comparison, it took me many weeks to initially master the intricacies of the C syntax: the operator precedence, the semicolon as terminator rather than separator, the confusion between assignment and comparison, the overloading of the parenthesis and comma syntax (precedence overriding, function calling, or sequencing of expressions?), the misleading (and in C++ ambiguous) similarity between declarations and statements, the trap of nested braceless if-else constructs, the trickiness of declaring the types of pointers to functions, the weird limitations of the preprocessor and its macros, etc.

And I haven't even started talking about the limitations of the semantics of C, which force you to go through the pains of syntactically and semantically ugly work-arounds. Limitations on scoping, on returning results, on static or dynamic initialization and finalization, on error handling, etc., induce a lot of error-prone, inefficient pointer manipulations and gotos. Not only are multiple results and dynamic bindings/windings simpler for humans to read and write, but with them, compiled Lisp code may often win over compiled C or C++ code thanks to better calling conventions. And yes, this is related to the question of syntax, inasmuch as the regular nestable syntax of Lisp makes for a natural nestable recursive semantics, whereas the ad-hoc syntax of C makes for non-nestable semantics with an ad-hoc hierarchy of levels.

In the same comment to my blog entry, Lisp syntax is accused of inducing endless levels of nesting. I dispute this notion. I find that typically, the depth of nested structures in Lisp programs is no greater than the depth of a parse tree in typical Perl or C programs. The shape of the source trees is indeed different, however, and that is the very purpose of having a different syntax: it changes not just the surface of the syntax, but also its contents. And the contents are such that with the proper macrology, you can get more done for the same amount of manageable complexity. Program structure naturally grows until it hits the barrier of human understanding; what changes from one language to another is not this barrier, which is built into man, but the amount of useful things you can express within this barrier. As argued in my previous post, Lisp syntax is a key part of what enables programmers to express more through readily available metaprogramming practices.

One argument against the Lisp syntax is the alleged redundancy of parentheses and keywords, which makes for low code density per line. However, (1) the semantics of Lisp programs largely makes up for it by requiring far fewer lines of code for the same functionality, which makes the overall code density of Lisp higher than in any other language; (2) Lisp syntax and semantics can be extended, and are casually extended, with reader macros and macros (if you don't know what they are, you've pretty much missed the whole point of Lisp), so that you can match exactly the code density and readability you want for any specific domain where you type lots of code; and (3) despite the long readable symbol names, completion allows for fast typing and local bindings allow for short code. If the argument were really about high code density per line, one should not advocate any of the mainstream programming languages; one should much rather advocate APL, TECO, Mathematica, and other such languages that make colloquial programs look much more like line noise to non-fluent readers than proverbial Perl programs ever will. Actually, APL and its current successors J and K are languages worthy of much respect.

Another argument against Lisp syntax is that its parentheses do not help distinguish between important semantic distinctions within program code. Well, the claim is true, but it's not a valid argument, it's a non-sequitur. Indeed, in Lisp, parentheses indicate code nesting, but not semantic distinctions; that doesn't mean these distinctions cannot be made easily in Lisp; it just means that in Lisp, these distinctions are carried by other means, namely the head symbol of each program form. This is arguably clearer than what happens in the often ambiguous syntactic mishmash of some other languages, where a prefix, postfix or infix operator may completely change the meaning of an expression, so that code must be read carefully and holistically before the precise semantic distinctions can be assessed. In any case, this is another case of wrong-headed arguments about Lisp based on projecting expectations that are only valid within the context of using other languages.
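
To illustrate how the head symbol alone can carry the semantic distinction, here is a toy evaluator sketched in Python, with nested tuples standing in for parenthesized forms. Everything in it (the `evaluate` function, the `FUNCTIONS` table, and the simplified `if` and `let` forms) is a hypothetical illustration under those assumptions, not the way any actual Lisp implementation works.

```python
import operator

# Hypothetical table of ordinary functions, looked up by head symbol.
FUNCTIONS = {"+": operator.add, "*": operator.mul, "<": operator.lt}


def evaluate(form, env):
    """Evaluate a form; the head of each tuple decides how it is treated."""
    if isinstance(form, str):          # a symbol: look up its binding
        return env[form]
    if not isinstance(form, tuple):    # a literal such as a number
        return form
    head, *rest = form
    if head == "if":                   # special form: only one branch is evaluated
        test, then, otherwise = rest
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if head == "let":                  # special form: introduces one local binding
        (name, value), body = rest
        return evaluate(body, {**env, name: evaluate(value, env)})
    # Anything else is a function call: evaluate all arguments, then apply.
    args = [evaluate(arg, env) for arg in rest]
    return FUNCTIONS[head](*args)


# Stands for: (let (x 3) (if (< x 5) (* x 2) 0))
program = ("let", ("x", 3), ("if", ("<", "x", 5), ("*", "x", 2), 0))
print(evaluate(program, {}))   # 6
```

The brackets are identical in all three kinds of form; a reader, human or machine, needs only the first element to know whether it is looking at a conditional, a binding construct or a plain call.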

As for what makes parentheses necessary, it isn't, as many people ignorantly claim, a matter of prefix syntax as opposed to infix syntax. It is a matter of fixed arity versus variable arity. Prefix syntax can go wholly without parentheses when the arity of each operator is known. Actually, the first systematic prefix syntax, the famous Polish Notation of Jan Łukasiewicz, was devised precisely as a way to get rid of parentheses altogether. The postfix variant of this notation, known as Reverse Polish Notation, has found a lot of practical uses, in FORTH, POP-2, HP calculators, PostScript, and many virtual machines. What makes parentheses necessary is the fact that Lisp has a single uniform extensible syntax that needs to accommodate an arbitrary number of arguments in program forms. As I argued before, it wasn't even a conscious design, but a historical discovery, that survived because it has many beneficial aspects. Actually, some Lisp dialects have a definite infix flavor (consider PLASMA). But even there, what makes parentheses necessary is the fact that the same syntax needs to be extensible to variadic functions and forms. The Lisp-derived language LOGO gets rid of parentheses exactly this way: by not having functions of variable arity. There could have been a paren-less sublanguage for simple operators, plus a generic parenthesized syntax for the general case, but in the early days of Lisp, such a disuniformity of syntax couldn't be afforded, and afterwards, it was found that the domains of application of Lisp were so diverse that you could never standardize a specific ad-hoc syntax for every single fixed-arity operator that happened to be in frequent use in one sub-domain or another. Instead, Lisp standardizes on a generic syntax that is immediately and unambiguously readable by any Lisper even without a priori knowledge of the arity of every operator that appears in a given domain.
Lisp also allows users to extend the language through reader macros and macros, should they want an ad-hoc syntax for their domain of choice. As for preferring a prefix syntax to a postfix syntax, macros have to be either prefix or parenthesized, and reader macros have to be prefix, so that even postfix languages such as FORTH or PostScript have special prefix syntax, and some equivalent of parentheses, although with semantic limitations (non-nestability, lack of proper scoping).
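
The arity argument above can be made concrete with a small sketch in Python. The names `ARITY`, `parse_prefix`, `parse_sexp`, `tokenize` and `atom` are all hypothetical, chosen for this illustration. The first parser reads parenthesis-free prefix notation and only works because it consults a table of fixed arities; the second reads parenthesized forms and needs no such table, so the same operator can take any number of arguments.

```python
# Hypothetical arity table: each operator takes a fixed number of arguments.
ARITY = {"+": 2, "*": 2, "neg": 1}


def atom(token):
    """Turn a token into an integer if possible, else keep it as a symbol."""
    try:
        return int(token)
    except ValueError:
        return token


def parse_prefix(tokens):
    """Parse parenthesis-free prefix notation, relying on ARITY."""
    def parse(pos):
        token = tokens[pos]
        pos += 1
        if token not in ARITY:
            return atom(token), pos
        args = []
        for _ in range(ARITY[token]):   # the arity says where the form ends
            arg, pos = parse(pos)
            args.append(arg)
        return (token, *args), pos

    tree, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return tree


def tokenize(text):
    """Split text into tokens, treating each parenthesis as its own token."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse_sexp(tokens):
    """Parse parenthesized forms; the closing parenthesis says where a form ends."""
    def parse(pos):
        token = tokens[pos]
        pos += 1
        if token != "(":
            return atom(token), pos
        items = []
        while tokens[pos] != ")":
            item, pos = parse(pos)
            items.append(item)
        return tuple(items), pos + 1    # skip the closing parenthesis

    tree, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return tree


print(parse_prefix("+ 1 * 2 3".split()))          # ('+', 1, ('*', 2, 3))
print(parse_sexp(tokenize("(+ 1 (* 2 3) 4 5)")))  # ('+', 1, ('*', 2, 3), 4, 5)
```

With `+` fixed at two arguments, `parse_prefix` rejects the token sequence `+ 1 2 3` as having trailing tokens, whereas the parenthesized `(+ 1 2 3)` is read without knowing anything about `+`.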

I speculate that much of the prejudice against parentheses can be traced back to the ambiguity of parentheses in C and similar syntaxes, and to their general use as a marker for unusually complex pieces of code in most languages as well as in mathematical usage. People who are fluent in such languages and who face Lisp code only once in a rare while may project such acquired emotions onto what they read, despite these emotions being totally inappropriate in the context of Lisp syntax, where parentheses are a regular piece of structuring syntax, much like braces and semicolons in C, or whitespace in Python. However, fluency with Lisp syntax is much easier and faster to acquire than fluency with C and C-like syntax. And then the emotion evoked by parentheses in the context of Lisp becomes very similar to the emotion evoked by whitespace in the context of Python.

Now, the same people who criticize the Lisp syntax accept without objection the syntax of the C programming language and similar or worse syntaxes such as those of C++, Java, Perl, etc. These syntaxes are presented as something one just has to learn and get used to, to be part of the elite of real programmers. Any criticism of these syntaxes gets the condescending reply that one should get over it and accept the quirks, which may be admitted but whose relevance is dismissed. Double standard? Certainly. The reproach concerning the syntax of Lisp is usually made by people who have seen it but never tried to use Lisp for anything meaningful. Those who have actually tried Lisp never criticize its syntax; if they try hard enough, they will soon have many valid reproaches to make to Lisp, but these reproaches all concern its semantics.

In the end, when people attack the syntax of Lisp, it is most usually but a rationalization without any technical merit regarding the syntax itself. This rationalization serves to cover up a defense mechanism against a foreign culture. It's a protective reflex against the cost of having to actually learn something new and different. The problem about Lisp here is not directly related to technical factors, but only to the fact that Lisp culture is not mainstream.

I do not want to deny that there are some technical factors that do influence the way that cultures gain, keep or lose influence. My point is that technical factors play no direct role in the observed reaction of aversion to Lisp syntax. It's just like people used to one alphabet, be it Roman, Cyrillic, Arabic, Hebrew, Thai, or whatever, who will feel ill at ease when reading texts in a different alphabet that they are not fluent in (and then there are non-alphabetic writing systems such as Chinese characters). This unease doesn't imply any great truth about the relative superiority or inferiority of any writing system and the languages that use it, as compared to other alternatives; nor does it preclude any such superiority or inferiority, on any of the infinitely many possible distinct relevant comparison scales.

As for non-technical factors that enter into play, and that may overwhelm any technical factor in the choice of a programming language (or of anything), I may invoke economic, social and political considerations. These considerations often have their own validity (information costs, transaction costs, network effects, established conventions, etc.), but often they do not. I claim that technically inferior solutions are often promoted based on wrong-headed considerations such as superstitions, obsolete traditions, irrational trends, ungrounded marketing, package-dealing, monopoly pressure, improper education, misevaluation of costs, political games, responsibility-avoidance, rush for what appears as an immediate solution, mindless copying, etc. (Will anyone step up to defend COBOL, FORTRAN or PL/I on intrinsic technical merit?) Does that mean that we must sit down and whine about the sorry state of the world? No. For in a free society, possessing information that others do not possess is an economic advantage, a market opportunity. Now, of course, our societies may suffer a lot from not being free. But inasmuch as that's a problem worth acting upon, it is an altogether different problem that requires an action of its own; and I contend that we are free enough to leverage the advantage of superior technological knowledge, as far as programming languages are concerned.

Tags: code evolution, dynamism, en, essays, lisp, meta, tao of programming