This short article discusses upcoming changes and future challenges for ASDF, the Common Lisp build system. It also draws lessons for a hypothetical successor to ASDF, for build systems in general, for languages in which to write them, and for languages that would have an internal build system that could rival modern build systems.
ASDF, "Another System Definition Facility", is the de facto standard build system for Common Lisp (CL). It is relatively lightweight (13 kloc, over half of which is for the portability layer UIOP, the "Utilities for Implementation- and OS- Portability"), quite portable (17 supported implementations), configurable (though importantly it "just works" by default), well-featured (it can create standalone executables), and extensible (e.g. with support for linking C code, or for compiling FORTRAN through Lisp). But it lacks many features of modern build systems such as Bazel: it does not support determinism and reproducibility, distribution and caching, cross-compilation to other platforms, building software written in languages other than CL, integration with non-CL build systems, management of multiple versions of the same software, or scaling to millions of files, etc. Historically, these limitations are due to ASDF being at heart an in-image build system in direct line of the original Lisp Machine DEFSYSTEM: it is designed to build and load software into the current Lisp image. But the challenges in possibly transforming ASDF into a modern build system touch limitations of Common Lisp itself and tell us something about language design in general.
I have essentially two development branches more or less ready for merge in the upcoming ASDF 3.3: the "plan" branch that provides proper phase separation (briefly discussed in my ELS 2017 demo), and the "syntax-control" branch that provides bindings for syntax variables around ASDF evaluation (briefly discussed in my ELS 2014 extended article, section 3.5 "Safety before Ubiquity").
The first branch solves the problem of phase separation. The branch is called "plan" because I started with the belief that most of the changes would be centered around how ASDF computes its plan. But the changes run deeper than that: 970 lines were added or modified all over the source code, not counting hundreds more that were moved around as the code got reorganized. That's double the number of lines of the original ASDF, and it took me several months (part time, off hours) to get just right. Still, it is up-to-date, passes all tests, and works fine for me.
To understand what this is about, consider that a basic design point in ASDF 1.0 to 3.2 is that it first plans your entire build, then it performs the plan. The plan is a list of actions (pairs of OPERATION and COMPONENT), obtained by walking the action dependency graph implicitly defined by the COMPONENT-DEPENDS-ON methods. Performing the plan is achieved by calling the PERFORM generic function on each action, which in turn will call INPUT-FILES and OUTPUT-FILES to locate its inputs and outputs.
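To make the plan-then-perform model concrete, here is a toy sketch in plain Common Lisp. It is not ASDF's actual code: the names `*toy-dependencies*`, `toy-depends-on`, `toy-make-plan` and `toy-perform-plan` are made up for illustration, and the dependency graph is hard-coded where ASDF would consult its COMPONENT-DEPENDS-ON methods.

```lisp
;; Toy model only (assumption: not how ASDF is implemented).
;; An action is an (operation . component) pair; each entry below is
;; an action followed by the actions it depends on.
(defparameter *toy-dependencies*
  '(((:load . "app") (:compile . "app") (:load . "utils"))
    ((:compile . "app") (:load . "utils"))
    ((:load . "utils") (:compile . "utils"))
    ((:compile . "utils"))))

(defun toy-depends-on (action)
  "Return the direct dependencies of ACTION."
  (cdr (assoc action *toy-dependencies* :test #'equal)))

(defun toy-make-plan (goal)
  "Phase 1: walk the (acyclic) graph depth-first; return all actions
needed for GOAL, dependencies first, each action only once."
  (let ((plan '()))
    (labels ((visit (action)
               (unless (member action plan :test #'equal)
                 (mapc #'visit (toy-depends-on action))
                 (push action plan))))
      (visit goal))
    (nreverse plan)))

(defun toy-perform-plan (plan)
  "Phase 2: 'perform' each action in order; here we only log it."
  (loop for (operation . component) in plan
        collect (format nil "~(~a~) ~a" operation component)))
```

Calling `(toy-perform-plan (toy-make-plan '(:load . "app")))` plans the whole build first and only then runs it, which is exactly the property that becomes a problem once the plan itself depends on code that has yet to be loaded.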
This plan-then-perform strategy works perfectly fine as long as you don't need ASDF extensions (such as, e.g. cffi-grovel, or f2l). However, if you need extensions, there is a problem: how do you load them? Well, they're written in Lisp, so you could use a Lisp build system to load them, for instance, ASDF! And so people either use load-system (or an older equivalent) from their .asd files, or more declaratively use :defsystem-depends-on in their (defsystem ...) form, which in practice is about the same. Now, since ASDF up until 3.2 has no notion of multiple loading phases, what happens is that a brand new separate plan is computed then performed every time you use this feature. This works well enough in simple cases: some actions may be planned then performed in multiple phases, but performing should be idempotent (or else you deserve to lose), therefore ASDF wastes some time rebuilding a few actions that were planned before an extension was loaded that also depended on them. However, the real problems arise when something causes an extension to be invalidated: then the behavior of the extension may change (even subtly) due to its modified dependency, and the extension and all the systems that directly or indirectly depend on it should be invalidated and recomputed. But ASDF up until 3.2 fails to do so, and the resulting build can thus be incorrect.
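For reference, the declarative style looks roughly like the following .asd fragment. The system name and file names are hypothetical; `:defsystem-depends-on` is the ASDF option discussed above, and `:cffi-grovel-file` is the component type that cffi-grovel makes available once it has been loaded.

```lisp
;; Hypothetical file my-bindings.asd (names are assumptions, for illustration).
;; ASDF must load cffi-grovel *before* it can even parse the
;; :cffi-grovel-file component below: that is the extra phase.
(defsystem "my-bindings"
  :defsystem-depends-on ("cffi-grovel")
  :depends-on ("cffi")
  :components ((:file "package")
               (:cffi-grovel-file "grovel" :depends-on ("package"))
               (:file "bindings" :depends-on ("grovel"))))
```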
The bug is quite subtle: to experience it, you must be attempting an incremental build, while meaningful changes were made that affect the behavior of an ASDF extension. This kind of situation is rare enough in the small. And it is easily remedied by manually building from scratch. In the small, you can afford to always build from scratch the few systems that you modify, anyway. But when programming in the large, the bug may become very serious. What is more, it is a hurdle on the road to making a future ASDF a robust system with deterministic builds.
Addressing the issue was not a simple fix, but required deep and subtle changes that introduce notions neglected in the previous simpler build models: having a session that spans multiple plan-then-perform phases and caches the proper information, neither too little nor too much; having a notion that loading a .asd file is itself an action that must be taken into account in the plan; having a notion of dynamically detecting the dependencies of loading a .asd file; being able to check cross-phase dependencies before keeping or invalidating a previously loaded version of a .asd file, without causing anything to be loaded in the process; expanding the state space associated with actions as they are traversed potentially many times while building the now multi-phase dependency graph. And all these things interfere with each other and have to be gotten just right.
Now, while my implemented solution is obviously very specific to ASDF, the issue of properly staging build extensions is a common user need; and addressing the issue would require the introduction of similar notions in any build system. Yet, most build systems, like ASDF up until 3.2, fail to offer proper dependency tracking when extensions change: e.g. with GNU Make you can include the result of a target into the Makefile, but there is no attempt to invalidate targets if recipes have changed or the Makefile or some included file was modified. Those build systems that do implement proper phase separation to track these dependencies are usually language-specific build systems (like ASDF); but most of them (unlike ASDF) only deal with staging macros or extensions inside the language (e.g. Racket), not with building arbitrary code outside the language. An interesting case is Bazel, which does maintain a strict plan-then-perform model yet allows user-provided extensions (e.g. to support Lisp). However, its extensions, written in a safe restricted DSL (that runs in the plan phase only, with two subphases, "load" and "analysis"), are not themselves subject to extension using the build system (yet the DSL being a universal language, you could implement extensibility the hard way).
Fixing the build model in ASDF 3.3 led to subtle backward-incompatible changes. Libraries available on Quicklisp were inspected, and their authors contacted if they depended on modified functionality or abandoned internals. Those libraries that are still maintained were fixed. Still, I'd just like to see how compatible it is with next month's Quicklisp before I can recommend releasing these changes upon the masses.
The current ASDF has no notion of syntax, and uses whatever *readtable*, *print-pprint-dispatch*, *read-default-float-format* or many other syntax variables are ambient at the time ASDF is called. This means that if you ever side-effect those variables and/or the tables that underlie the first two (e.g. to enable fare-quasiquote for the sake of matching with optima or trivia), then call ASDF, the code will be compiled with those modified tables, which will produce fasl files that are unloadable unless the same side-effects are present. If systems are modified and compiled that do not have explicit dependencies on those side-effects, or worse, that those side-effects depend on (e.g. fare-utils, that fare-quasiquote depends on), then your fasl cache will be polluted and the only way out will be to rm -rf the contaminated parts of the fasl cache and/or to build with :force :all until all parts are overwritten. Which is surprising and painful. In practice, this means that using ASDF is not compatible with making non-additive modifications to the syntax.
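The effect of ambient syntax variables is easy to reproduce without ASDF. In this minimal sketch (the helper `reads-as-double-p` is hypothetical, and the example assumes an implementation where single and double floats are distinct types, as is the case on most of them), the same source text yields different objects depending on the value of `*read-default-float-format*` at read time.

```lisp
(defun reads-as-double-p (text)
  "Read TEXT under the ambient syntax variables; true if it reads as a double-float."
  (typep (read-from-string text) 'double-float))

(defun same-text-two-readings (text)
  "Return two booleans: does TEXT read as a double under each float format?"
  (list (let ((*read-default-float-format* 'single-float))
          (reads-as-double-p text))
        (let ((*read-default-float-format* 'double-float))
          (reads-as-double-p text))))
```

A file containing the literal `1.5` therefore compiles to different constants depending on what the caller happened to set before invoking the build, and nothing in the resulting fasl records which setting was in effect.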
Back in the 3.1 days, I wrote a branch whereby each system has its own bindings for the syntax variables, while the default tables are made read-only (if possible, which it is in many implementations). With that branch, the convention is that each system can modify the syntax in whatever way it wants, and that will only affect that system; however, changes to syntax tables must be done after explicitly creating new tables, and any attempt to side-effect the default global tables will result in an error.
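The discipline of "create a new table before you modify it" can be illustrated in plain Common Lisp. This is only a sketch of the convention, not the code of the branch; the function `read-with-bang-syntax` and the `!x` notation are made up for the example.

```lisp
(defun read-with-bang-syntax (text)
  "Read TEXT with `!x` meaning (NOT x), without touching the ambient readtable."
  ;; (copy-readtable nil) returns a fresh copy of the standard readtable,
  ;; so the modification below is confined to this dynamic extent.
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\!
                         (lambda (stream char)
                           (declare (ignore char))
                           (list 'not (read stream t nil t))))
    (read-from-string text)))
```

Inside the function, `"!foo"` reads as `(NOT FOO)`; outside it, assuming the ambient readtable was left standard, the same text still reads as a plain symbol, because only the private copy was modified.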
This was the cleanest solution, but alas it is not compatible with a few legacy systems that explicitly depend on modifying the syntax tables (and/or variables?) for the next system to use, as ugly as that is. My initial opinion was that this should be forbidden, and that these legacy systems should be fixed; however, these were legacy systems at a notable Lisp company, with no one willing to fix them; also, I had resigned from maintainership and the new maintainer is more conservative than I am, so in the end the branch was delayed until after said Lisp company would investigate, which never happened, and the branch was never merged.
A simpler and more backward-compatible change to ASDF would have been to have global settings for the variables that are bound around any ASDF session. Then, the convention would be that you are not allowed to use ASDF again to load regular CL systems after you modify these variables in a non-additive way; and the only additive changes you can make are to add new entries to the shared *readtable* and *print-pprint-dispatch* tables that do not conflict with any default entry or earlier entry (and that includes default entries on any implementation that you may want to support, so e.g. no getting #_ or #/ if you want to support CCL). Even additive changes, if made, must somehow not clash with each other, or they become non-additive; but there is no way to automatically check that this is the case and issue a warning. After you make non-additive changes (if you do), then ASDF can't be used anymore to build normal systems that may conflict with those changes, and if they are modified and you call ASDF on a system that depends on them, you lose (or you must first make all those systems immutable).
Note that because ASDF would already break in those cases, most of these constraints de facto exist, are enforced, and are respected by all ASDF users. There remains the question of binding the variables around the build, which allows normal systems to be built even if a user changes the variables, or not binding them, which puts the onus on most users of keeping these variables bound to reasonable values around calls to ASDF, for the benefit of a few users who would want their own breaking changes to persist after the build. I believe the first option (bind the variables) is cleaner, though the second (basically, do nothing) is more backward-compatible.
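The first option can be sketched as a wrapper that rebinds the syntax variables around whatever the build does. The names `*build-readtable*` and `call-with-build-syntax` are hypothetical and do not correspond to any ASDF API; the sketch leans on the standard `with-standard-io-syntax` macro for the bulk of the bindings.

```lisp
(defvar *build-readtable* (copy-readtable nil)
  "Hypothetical shared readtable for builds; meant to be extended only additively.")

(defun call-with-build-syntax (thunk)
  "Call THUNK with syntax variables rebound to known values, whatever the caller set."
  ;; WITH-STANDARD-IO-SYNTAX resets *read-default-float-format*,
  ;; *print-pprint-dispatch*, *read-base*, etc. to their standard values.
  ;; Note that it also binds *package* to COMMON-LISP-USER.
  (with-standard-io-syntax
    (let ((*readtable* *build-readtable*)
          (*print-readably* nil))
      (funcall thunk))))
```

Because these are dynamic bindings, the user's own settings are back in force as soon as the call returns: the build is shielded from the user's changes, and the user's changes survive the build.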
In all cases, you can always make non-additive changes to a readtable (such as enabling fare-quasiquote) by locally binding *readtable* to a different value, e.g. using named-readtables:in-readtable. A local binding won't adversely affect the ASDF build; but unless ASDF is changed to enforce its own bindings, you'll have to make sure to manually undo your local bindings before you call ASDF again.
The problem with not adding any syntax-control to ASDF is that it forces Lispers to always be conservative about modifying the readtable and calling ASDF (or having it called indirectly by any function whatsoever that they call, which they can't always predict). In practice this makes hacking CL code hostile to interactive development with non-additive syntax modification; which defeats in social conventions a technical feature of the language often touted as cool by its zealots. If syntax-control is added to ASDF, then you can freely do your syntax modifications and be confident that building code won't be adversely affected.
The current branch implements the simpler option of binding variables around ASDF sessions, and using a mutable shared readtable that should only be modified additively. It has probably bitrotten, and should be updated or rewritten. The current maintainer, Robert Goldman, should probably opine on which change to adopt with what schedule (3.3.0? 3.2.2? 3.3.1? 3.4.0?) and sign off on the API.
Vanquishing Language Limitations
These two modifications are ((now)low)-hanging fruits in making ASDF a more robust build tool, one that supports working with non-trivial extensions to the build system or the Lisp syntax. And in both cases, the limit reached by ASDF is ultimately that CL is a hippie language that allows unrestricted global side-effects and disallows disallowing. Therefore extensions necessarily introduce potential conflicts with each other that have to be solved in wetware via convention, whereby all users are to be trusted not to go wild with side-effects. The system cannot even detect violations and warn users of a potential mistake; users will have to experience subtle or catastrophic failure and figure out what went wrong.
A better language for a build system should be purer: inasmuch as it has "global" side-effects, it should allow "forking" the "global" state in an efficient incremental way. Or even better, it should make it easy to catch side-effects and write this forking support in userland. At the very least, it would make it possible to detect violations and warn the user. Bazel is an example build system with an extension language that has local side-effects, but globally has pure forked environments. A successor to ASDF could similarly provide a suitably pure dialect of Lisp for extensions.
Happily, adding better syntax control to ASDF suggests an obvious solution: ASDF extensions could be written in an enforceable subset of a suitable extension of Common Lisp. Thus, ASDF extensions, if not random Common Lisp programs, can be made to follow a discipline compatible with a deterministic, reproducible build.
What would be an ideal language in which to write an extensible build system? Well, I tackled that question in another article, the Chapter 9: "Build Systems" of my blog "Ngnghm". That's probably too far from CL to be in the future of ASDF as such, though: the CL extension would be too large to fit ASDF's requirement of minimalism. On the other hand, if such a language and build system is ever written, interest in CL and ASDF might wane in favor of said latter build system.
In any case, in addition to not being a blub language, features that will make for a great programming language for an integrated build system include the following: making it possible to directly express functional reactive programming, determinism as well as system I/O, laziness as well as strictness, reflection to map variables to filesystem and/or version control as well as to stage computations in general including dynamic build plans, hygiene in syntax extension and file reference, modularity in the large as well as in the small, programmable namespace management, the ability to virtualize computations at all sizes and levels of abstractions, to instrument code, etc.
Now, before we get reproducible builds, we also need to enable cross-compilation for ASDF systems, so the necessarily unrestricted side-effects of compiling Common Lisp code cannot interfere with the rest of the build. Cross-compilation also allows building on a different platform, which would be important to properly support MOCL, but would probably also mesh well with support for building software in arbitrary other languages.
Importantly, instead of the (perform operation component) protocol that specifies how to build software in the current image, a (perform-form target operation component) protocol (or maybe one where the target information has been made part of the operation object) would return forms specifying how to build software, which could then happen in a separate Lisp or non-Lisp process, on the same machine or on another worker of a distributed build farm.
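A toy sketch may help picture such a protocol. Nothing here exists in ASDF: the classes `toy-compile-op` and `toy-source-file`, the generic function `toy-perform-form`, and the convention of naming outputs after the target are all assumptions made for illustration.

```lisp
(defclass toy-compile-op () ())

(defclass toy-source-file ()
  ((name :initarg :name :reader toy-name)))

(defgeneric toy-perform-form (target operation component)
  (:documentation
   "Return a form that builds COMPONENT for TARGET, without running it."))

(defmethod toy-perform-form (target
                             (operation toy-compile-op)
                             (component toy-source-file))
  ;; Instead of calling COMPILE-FILE in the current image, return the
  ;; call as data, so that any process on any machine can evaluate it.
  (let ((name (toy-name component)))
    `(compile-file ,(format nil "~a.lisp" name)
                   :output-file ,(format nil "~a.~a.fasl"
                                         name (string-downcase target)))))
```

Since the result is plain data, it can be printed, hashed for caching, or shipped to a worker, none of which is possible with the side effects of a PERFORM method.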
Note however, that one essential constraint of ASDF is that it should keep working in-image in the small and not depend on external processes or additional libraries. Any serious effort towards a "deterministic build" should therefore remain an extension indeed (though one users would load early).
Still, if this extension is to remain compatible with ASDF and its .asd files, providing a backward-compatible path forward, then modifications and cleanups may have to be done to ASDF itself so it behaves well. Even keeping that hypothetical deterministic build separate, I expect non-trivial changes to the ASDF API to enable it, such as the perform-form protocol mentioned above. But backward-compatibility and smooth transition paths have always been the name of the game for ASDF; they are what make possible an ecosystem with thousands of packages.
There is a precedent for an ASDF extension leading to (most positive) changes in ASDF: POIU, the "Parallel Operators on Independent Units", Andreas Fuchs' extension to compile files in forks (but still load them in-image). Making sure that POIU can be expressed as an extension of ASDF without redefining or breaking the provided abstractions was instrumental in the evolution of ASDF: it led to many cleanups in ASDF 2, it inspired several of the breakthroughs that informed what became ASDF 3, and it kept influencing ASDF 3.3.
Thus, even though ASDF will stay forever an in-image build system, and even though a deterministic build extension (let's call it FDSA, the Federated Deterministic System Assembler) may ultimately remain as little used as POIU (i.e. because it lacks sufficient benefits to justify the transition costs), I expect the design of the base ASDF to be deeply influenced by the development of such a tool (if it happens).
Looking for new developers
Robert Goldman and I are not getting younger, not getting more interested in ASDF, and we're not getting paid to hack on it. We are looking for young Common Lisp hackers to join us as developers, and maybe some day become maintainers, while we're still there to guide them through the code base. Even without the ambition (and resources) to actually work towards a hypothetical FDSA, our TODO file is full of items of all sizes and difficulties that could use some love. So, whatever your level of proficiency, if you feel like hacking on a build system both quite practical and full of potentiality, there are plenty of opportunities for you to work on ASDF (or a successor?) and do great, impactful work.