The first computer I used had about 2KB of RAM. The other day, I compiled a 2KB Common Lisp script into a 16MB executable to get its startup (and total execution) time down from 2s to subjectively instantaneous, and that didn't bother me in the least, for my current computer has 8GB of working memory and over 100GB of persistent memory. But it did bother me that it didn't bother me, for 16MB was also the memory on the first computer on which I felt I wasn't RAM-starved: I could run an X server, an Emacs editor and a shell terminal simultaneously without swapping! Now an entire comfortable software development universe could be casually wasted over a stupid optimization, one that I have to care about because software systems still suck. And to imagine that before sentientkind reaches its Malthusian future, code bumming will have become a popular activity again...
Background (skip to the next paragraph if you don't care for hardware war stories):
I just returned my many-year-old work laptop (a Lenovo Thinkpad X230),
because of various hardware issues I was starting to experience:
mostly a bad connection with the batteries that at times caused the machine to shut down at the least auspicious moment,
in addition to the traditional overheating and the
wifi card that often failed to connect, requiring the
wpa_supplicant daemon to be killed.
I liked the Thinkpad form factor a lot, but my employer wasn't offering Thinkpads anymore,
so I opted instead for a slim HP EliteBook Folio 1040:
its form factor is obviously inspired by the MacBook Air,
except it is running a Linux system on which I am master of my ship.
Now, the EliteBook has a touchpad that is particularly bad,
even worse than the Thinkpad's in being triggered all the time by my thumb as I type;
I decided to disable it immediately, just like I did eventually with the Thinkpad;
however, unlike the Thinkpad, the EliteBook doesn't have
a "clit" interface to supplement the touchpad.
Therefore I had to toggle the touchpad on and off instead of permanently disabling it.
A Google search quickly found a shell script to
toggle the touchpad,
and instructions on how to map
Penguin-Space to it
(the Penguin is Super,
much more so than the Windows it replaces).
But the shell script frankly made me puke,
and I decided to rewrite it in Lisp, which yielded
a very nice program less than 2KB long...
Well, I recently added an extra nail to the coffin, that addresses the remaining tradeoff
between startup times and memory occupancy:
it is now possible to easily share a dumped image between all the scripts you need,
to achieve instant startup without massive bloat of either working memory or persistent storage.
Admittedly, you could already do it semi-portably on SBCL and CCL using Xach's buildapp,
but now you can do it fully portably on all implementations using cl-launch,
with the same command line that you would use to invoke the program as a script.
The portable way to write a Common Lisp script is to use cl-launch,
typically via a
#!/usr/bin/cl script specification line when you're using Unix.
However, when launching a script this way,
even a relatively simple program can take one to several seconds to start:
the Lisp compiler locates and loads all the object files into memory,
linking the code into symbol, class and method tables;
and this somehow takes a non-negligible amount of time even when the files were precompiled,
because compilers were never optimized to make this fast;
indeed the typical Lisp hacker only recompiles and reloads one file at a time at his interactive REPL,
and doesn't often reload all the files from scratch.
By installing ASDF 3.1.4 over the one provided by SBCL using the provided install-asdf.lisp script, and by using the provided cl-source-registry-cache.lisp script to avoid searching through my quite large collection of CL source code, I could get the startup time down to around 0.7 or 0.8s, but that was still too much.
This is fine for computation-intensive and/or long-running programs
for which this startup latency doesn't matter.
But that makes this solution totally impractical for interactive scripts
where execution latency is all-important,
as compared to other scripting languages, that while inferior as languages,
at least start up instantaneously in subjective time.
perl executes an empty command in about 5 ms of wall clock time,
python in about 18 ms
(all timings and sizes are rough averages and estimates on my current Linux x86-64 laptop).
Without my portability infrastructure,
you can also do the same with
sbcl in 10 ms or
clisp in 15 ms,
but then you lose the portability and are either restricted to not using any software library,
or are back in non-portable configuration and compilation hell
in addition to having the same slow loading issue.
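
A rough way to reproduce this kind of measurement is sketched below in POSIX shell. It assumes a date command that supports the %N nanosecond format (GNU coreutils does); the function name startup_ms is made up for this illustration, and the figure it reports includes the cost of spawning each process, so the results are only indicative.

```sh
#!/bin/sh
# startup_ms N CMD [ARGS...]
# Print the average wall-clock time, in whole milliseconds, of running
# CMD N times in a row.
# Assumption: `date +%s%N` prints nanoseconds since the epoch (GNU date).
startup_ms() {
  runs=$1; shift
  start=$(date +%s%N)
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1
    i=$((i + 1))
  done
  end=$(date +%s%N)
  echo $(( (end - start) / runs / 1000000 ))
}

# Example: average startup time of a shell running an empty command.
startup_ms 20 sh -c ':'
```

Replacing the last line's command with perl -e '' or python -c '' gives the corresponding figure for those interpreters, provided they are installed.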
With such a startup pause, Common Lisp might remain somewhat suitable for scripting,
unlike the vast majority of compiled programming languages,
which require an explicit compilation step
with non-trivial configuration of source and object files;
still it finds itself unsuitable for producing scripts destined for use as instantaneous interactive commands
outside its own autistic interactive development environment.
Now, all serious Common Lisp implementations also allow you to dump a memory image,
with all the code already loaded and linked, and such images start quite fast,
about 20 ms for a fully loaded image on
sbcl, about 35 ms on
and you can portably dump an image using my cl-launch
by just adding
--output /path/to/executable --dump !
to the very same command you'd use to start a script.
Thus, at the expense of an extra but trivial build step that takes many seconds once,
you can portably transform your slow-starting scripts into precompiled executables
that have startup times competitive with other scripting languages,
and efficiency competitive with other compiled languages.
The problem is that such an image has a significant overhead in terms of space:
a minimal cl-launch program has an image
of size 13MB with CLISP, 28MB on CCL, or 52MB on SBCL
(which isn't that bad when you consider this contains the entire compiler and basic libraries
— GCC is bigger than that!);
an image with all the code I want loaded takes
27MB on CLISP, 50MB on CCL, 82MB on SBCL.
A poly-deca-megabyte image file is no big deal.
The biggest of these images is 1% of the memory of my laptop.
So, by today's standards, it's a small additive overhead.
But if you need one image per script, then 80MB of memory to execute a 2KB script
is a multiplicative factor of 40000 in memory waste, and that is not acceptable
if, like me, you want to replace lots of small shell scripts with Common Lisp code.
Compare that to the incremental space expenditure for each additional 1KB of scripting code,
which is typically between 1KB and 10KB of additional size for the image,
a reasonable factor of 1 to 10.
This suggests an obvious solution:
to share the image-dumping expenditure between all your CL scripts,
so the space overhead is back to a negligible additive overhead and reasonable multiplicative factor,
instead of being an outrageous multiplicative factor.
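
The arithmetic behind these factors is easy to check. The sketch below uses shell integer arithmetic with the round figures from the text (an 80MB image, 2KB scripts, and decimal units so that 1MB = 1000KB); the function name waste_factor and the count of 40 scripts are invented for this illustration.

```sh
#!/bin/sh
# waste_factor IMAGE_KB SCRIPT_KB [N_SCRIPTS]
# Ratio of the image size to the total source size of the N_SCRIPTS
# scripts that share it (default: one script per image), rounded down.
waste_factor() {
  image_kb=$1
  script_kb=$2
  n_scripts=${3:-1}
  echo $(( image_kb / (script_kb * n_scripts) ))
}

waste_factor 80000 2      # one 80MB image per 2KB script: prints 40000
waste_factor 80000 2 40   # the same image shared by 40 such scripts: prints 1000
```

This deliberately ignores the 1KB to 10KB that each additional script adds to the image, which is small next to the size of the base image.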
BusyBox made popular the old concept of a multi-call binary:
a single executable that behaves differently depending on the name under which it was invoked,
such that by using multiple symbolic links (or hardlinks) to the same program,
you can replace multiple different binaries with a single one,
benefiting from both the sharing effects of dynamic linking and the optimizations of static linking.
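
As a minimal sketch of the idea (in POSIX shell rather than a compiled binary, and with made-up command names hello and goodbye), a multi-call program looks at the name it was invoked under and dispatches on it:

```sh
#!/bin/sh
# Build the demo in a scratch directory.
dir=$(mktemp -d)

# One program, written once: it dispatches on the basename of $0.
cat > "$dir/multi" <<'EOF'
#!/bin/sh
case "$(basename "$0")" in
  hello)   echo "hello, ${1:-world}" ;;
  goodbye) echo "goodbye, ${1:-world}" ;;
  *)       echo "unknown name: $(basename "$0")" >&2; exit 1 ;;
esac
EOF
chmod +x "$dir/multi"

# One symbolic link per name the program should answer to.
ln -s multi "$dir/hello"
ln -s multi "$dir/goodbye"

"$dir/hello"          # prints "hello, world"
"$dir/goodbye" Lisp   # prints "goodbye, Lisp"
```

Invoking the program under its own name, multi, falls through to the error branch. The scratch directory can be removed afterwards with rm -rf "$dir".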
The same can be done for Common Lisp code.
Xach's buildapp already let you do that on SBCL, then on CCL,
using its --dispatched-entry option;
I just enriched
cl-launch 4.1.2 to support the very same interface
on all its 12+ supported implementations
(well, the same interface, modulo a different treatment of corner cases).
Now, I already share the same executable for 7 personal use scripts,
and will only use CL for new scripts while slowly migrating all my old scripts.
[November 2015 update: now 44 personal scripts in a 95MB SBCL image that starts in 16ms]
The feature was a hundred lines of code total, including comments, documentation
and a new cl-launch-dispatch system;
the Lisp support for this feature is only loaded on demand if you use --dispatch-entry,
at which point it is marginally free to load a tiny additional ASDF system.
I love how Common Lisp lets me implement this feature in such a modular way.
Here is the documentation:
If option -DE --dispatch-entry is used, then the next argument must follow the format NAME/ENTRY, where NAME is a name that the program may be invoked as (the basename of the uiop:argv0 argument), and ENTRY is a function to be invoked as if by --entry when that is the case. Support for option -DE --dispatch-entry is delegated to a dispatch library, distributed with cl-launch but not part of cl-launch itself, by:
- registering a dependency on the dispatch library as if --system cl-launch-dispatch had been specified (if not already)
- if neither --restart nor --entry was specified yet, registering a default entry function as if by --entry cl-launch-dispatch:dispatch-entry
- registering an init-form that registers the dispatch entry as if (cl-launch-dispatch:register-name/entry "NAME/ENTRY" :PACKAGE) had been specified, where PACKAGE is the current package. See the documentation of said library for further details.
Now, this is a great workaround, but it doesn't fully solve the original issue. To completely solve it, an obvious strategy would be for some implementation to radically optimize the loading of compiled objects (so-called FASL files, for FASt Loading, which some jest should be renamed SLOw Loading), so that it becomes actually fast. For instance, the compiler could produce a prelinked object that optimistically assumes it knows the load address, that there will be no conflict in symbol tables, class and method definitions, etc., and at runtime patches only a minimal set of pointers in the usual case. Doing it for 12+ implementations is not doable, but a single one would suffice, say SBCL or CCL. Alternatively, an "incremental image" feature might do, whereby one could dump all the symbols in some set of packages and not others, with associated functions, classes, etc.; it would require a minor change in programmers' habits, though, so it is less likely to happen. But any such complete solution will require hacking into the guts of a CL implementation, and that's no small undertaking.
Assuming we are not going to improve the underlying implementations, a more long-winded "solution" might be to extend the workaround until it becomes a solution: enabling the automatic sharing of executables between all the programs that matter. The old Common-Lisp-Controller from Debian could be resurrected, to create shared images and/or shared executables for software installed by the system's package manager; a similar mechanism could declaratively manage all the programs of a given user (possibly layered on top of the above when available). This might require some tweaks to ASDF so that it doesn't try to build pre-built software from system-managed directories using system-managed implementations, but compiles the usual way when there is a user-specified upgrade, the software wasn't built, or the implementation isn't system-managed. Importantly, there must not be an insecure writeable system-wide FASL cache (i.e. the system must revert to a per-user cache when any write access is required, or somehow talk to a trusted daemon to compile trusted sources with trusted compilers). This workaround through system management is somewhat ugly, though.
Note that these issues do not affect Common Lisp developers who run the functionality provided by these scripts from the Common Lisp REPL; they can already do that. They only affect users who run the functionality of these scripts from the shell command line or from some other external non-Lisp program. To a Common Lisp developer who needs such a use case, the solution to these issues is now trivial thanks to this new cl-launch feature. But these issues do make it hard for people to publish scripts that will "just work" for end-users (an end-user being someone who shan't be required to manage an installation or configuration step). These end-users will have to either suffer a multi-second pause at startup, or be burdened with a poly-deca-megabyte executable for every script or set of related scripts they use. And so, the temporary conclusion is that while Common Lisp is in many ways far ahead of the competition as a low-overhead "scripting language", it is at the moment at a disadvantage against this competition in one crucial way: deployment to end-users.