Why Language Extensibility Matters

If you neglect some aspect of computation in designing your programming language, then the people who actually need that aspect will have to spend a lot of time dealing with it by themselves; and if you don't let them do it by extending your language, they'll have to do it with inarticulate barkings rather than civilized sentences.

So you think you can overlook such facts of life as having to support multiple concurrent isolated virtual worlds on the same machine, communicating with other such worlds on other machines? Forget to specify how concurrent activities can coexist at all? Leave persistence and retrieval of data out of your language specification? Omit ways to interface directly with existing libraries and external programs? Or just do a shoddy job of it, one not robust or comprehensive enough to sustain serious use in software that matters to programmers?

Then programmers will retrofit into your language such crocks as threads, whereby user code may arbitrarily break the system invariants of your otherwise safe language (if it was ever safe at all). They will resort to brittle external tools such as shell scripts to bind together the pieces of your system. They will curse you as they scramble to reimplement the interfaces you left out, in as many different and differently fragile ways. They will waste countless months rebuilding infrastructure to properly talk to databases, and go crazy having to deal with persistence, transactionality and schema upgrades through manually coded conventions rather than automatically enforced mechanisms.

Now there are infinitely many possible aspects to programming, with new ones being invented every day and becoming popular every year; as a programming language designer, you can't reasonably be expected to have already provided a well-designed interface to all of them in your programming language. Handling every aspect of programming that anyone will ever care about requires not only more resources than you can ever provide, but also a wilder imagination than you can ever have. And so you can never design a system that does everything for everyone, all the time. There will therefore be (infinitely) many things that your computing system won't be able to fully handle; and even for the finitely many things that you do handle, you will find that you lack the resources to update and maintain your system so that it keeps handling them as the related needs evolve.

But then, you should at least admit defeat, and not try to fake having a solution: give your users direct access to the guts of the system, with which they can build actual solutions, instead of providing misdesigned partial "solutions" that can't possibly be made to fit their needs. Any half-assed pseudo-solution you offer will be treated as damage and routed around. Programmers who need to get things done will bypass your puny file-system abstractions and directly use low-level system call interfaces -- or they will just prefer better languages that don't introduce any superfluous impedance mismatch between them and the realities they have to deal with. Until, of course, such languages degenerate to the point that they introduce an impedance mismatch once again! Which might be but a minor nuisance, or might convince users to fork your language or migrate away, depending on how much it matters to them.

Your system can't be the be-all end-all for everyone all the time. That is why it is very important that your system should be extensible. Now, some systems are only extensible because the source code is available. That's already infinitely better than systems where the source code isn't. But one can do much better: you can design your system so that it can be extended from within, with some macro system as in Lisp or Scheme, or some grammar extension system as with camlp4 or pepsi.

The original Lispers had this principle that there should be no system-programmer privilege: any user of the language or system should be able to do anything that the system or language implementor can do. For instance, whereas Pascal had a magic multi-argument PrintLn procedure while all user-defined procedures had to take a fixed number of arguments, Lisp goes to great lengths so that all the magic available in any of its functions or special forms can just as well be reimplemented or done differently by language users with the available primitives. You can implement your own elaborate object system on top of the language, and that's even how the reference implementation of the Common Lisp Object System is itself written; that's also how in practice users implement, say, an automatically prefixing variant of with-accessors with the help of symbol-macros and/or setf-expanders.
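Here is a minimal sketch of such a prefixing variant, written as an ordinary user in standard Common Lisp; the name with-prefixed-accessors is mine, purely for illustration:

    ;; A user-level variant of WITH-ACCESSORS that automatically prefixes
    ;; the bound names, built out of SYMBOL-MACROLET. The macro name is
    ;; illustrative, not part of any standard.
    (defmacro with-prefixed-accessors ((prefix &rest accessors) object &body body)
      (let ((obj (gensym "OBJ")))
        `(let ((,obj ,object))
           (symbol-macrolet
               ,(loop for acc in accessors
                      collect `(,(intern (format nil "~A~A" prefix acc))
                                (,acc ,obj)))
             ,@body))))

    ;; Usage: each P-FOO below is a symbol-macro expanding to (FOO point),
    ;; readable and settable through the accessor, just like WITH-ACCESSORS.
    ;; (with-prefixed-accessors (p- x y) point
    ;;   (setf p-x (+ p-x 1))
    ;;   (list p-x p-y))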

Now when everything can be defined and redefined, there is the worry that you can't trust anything, because the most basic things might have been redefined. The worry is actually quite legitimate, as malicious JavaScript cross-site scripting attacks have shown. But the solution is not to prevent language modification; the solution is to let the user abstract over and control which modified language variant he is using. Thus, anyone should be able to easily define a new language, create compilers, analyzers and processors for such a language, and write programs in that language or program-generators targeting it. Any invocation of a program would then include the implicit or explicit specification of an appropriate processor, with sane defaults that can't cause unrelated programs to go wrong. This the Common Lisp community has so far failed to do, with its paradigm of side-effecting the read-time or compile-time environments. But PLT Scheme handles such language abstraction like a charm, by letting each module declare what language it is written in.
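Concretely, that declaration is a single line at the top of each module; a sketch (the #lang mechanism is PLT's, the toy function is mine):

    ;; This module opts into the standard scheme language:
    #lang scheme
    (define (greet name)
      (string-append "hello, " name))

    ;; Another module in the same program can opt into a different
    ;; language, e.g. the lazy variant, without any global side effect
    ;; on the modules around it:
    ;; #lang lazy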

To paraphrase Hayek, IndividuaLisp is thus an attitude of humility before the program languaging process and of tolerance to other opinions, and is the exact opposite of that intellectual hubris which is at the root of the demand for comprehensive direction of the program languaging process.

Comments

(Anonymous)

Award: Most Worthless Blog Post of the Day

You scored high for:
* Saying absolutely nothing
* Wasting my time
* Posting when you shouldn't have
* Pretending the old is new again
* Having a fascist comment moderation system
* Repeating tired lisp dogma yet ignoring that any extension to a Lisp system is not easily handled by tools (FYI your tool has to parse Turing complete macros, THAT'S HARD).

Dealing with language extensions

It is certainly hard to deal with language extensions, but

1- PLT Scheme for instance does it quite well, allowing you to debug in your actual source language and even graphically browse various code analyses despite the fact that it internally works on the fully macroexpanded code.

2- A lot of tractable techniques for code generation have been designed that tremendously simplify the job of automating analysis of code extensions: Macro Hygiene, Higher-Order Abstract Syntax, Type Systems for Staged Computation, etc.

3- For all the warts of macro-expansion, the alternative is much worse: having to deal with manually expanded "design patterns", with the bugs of manual expansion, duplication from copy-paste, de-coherence of copies when the pattern changes, human problems when a modification to the "pattern" requires cooperation on two sides of an interface, and the reverse-engineering of it all if you are to do any manual or automatic "refactoring" -- see the sketch below. Good luck.
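To make point 3 concrete, a minimal sketch in standard Common Lisp (the macro name with-resource is mine, echoing the standard with-open-file):

    ;; One macro captures the acquire/use/release pattern once; call
    ;; sites can no longer drift out of sync with the pattern, because
    ;; there is only one copy of it to maintain.
    (defmacro with-resource ((var acquire release) &body body)
      `(let ((,var ,acquire))
         (unwind-protect (progn ,@body)
           (funcall ,release ,var))))

    ;; Usage: change the pattern in one place (say, add logging around
    ;; the release step), and every call site is updated by re-expansion.
    ;; (with-resource (s (open "notes.txt") #'close)
    ;;   (read-line s))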

Of course, none but the most basic techniques are available in standard Common Lisp -- but no one ever claimed that to be the be-all end-all.

(Anonymous)

Re: Award: Most Worthless Blog Post of the Day

The most extensible language ever was Forth; Forth gives you the ability to run code at compile time, primitives to steal the source stream from the compiler, suck in your own syntax, and generate your own code. I wrote a PostScript interpreter in a Forth-like language and implemented the following code:

Check( Number Number --> Number )

The compiler in this case is named "Check(". Yes, that's right, the paren was part of the compiler's name. It parsed up to the closing paren, and generated assembly code to validate that two numbers were given on the data stack, checked for stack underflow, and required that a number be left on the data stack by the following code.

I often miss programming in that world, the highly interactive, write-only language that Forth can be. Oh well... THAT kind of extensibility just isn't allowed in modern languages...

(Anonymous)

Your language link was interesting (I thought my browser was buggy when half the word was colored as visited), but you should update it to (at least) [python][ruby], not [obsolete-language-whose-name-I-won't-mention][python] as it is now.

Weak argument

Extensibility is very important, but you make a very weak argument for macros. Python lacks macros entirely, so shouldn't it suck?

Also, wouldn't it be that much better to alter the parser itself, defining syntax that really matches what you need? Sure, there's more to learn when switching from language to language, but large-scale macro systems (such as CLOS) have that problem anyway.

Impedance mismatch

Minor nitpick: Python 3.0 didn't introduce the impedance mismatch — it was there all along. The user wants text and the OS only handles bytes. Only if you eliminate the user can you remain blissfully ignorant and only handle bytes.

Python 3.0's defaults simply make having a user much easier.

Other things matter too

You're not convincing me with all that rambling :-)

I like the idea of implementing Python in Lisp, using macros behind the scenes, making Python extensible in Lisp and vice versa. That didn't make sense two decades ago when Python was begun... can you imagine writing it in Common Lisp at the time? It would've been an academic curiosity... and I'd still be using C.

Python got where it is by focusing on practical concerns first, with gradual refinements to make it more regular and extensible, more Lisp-like. Python 3.0 finally has sensible lexical scope rules, which Scheme had 30 years ago. Conversely, PLT Scheme 4.0 finally has a module system on par with Python's. It's nice to see both camps learning from one another. Of course, I use Python for real work. Being part of a cohesive community has its benefits.