February 17th, 2010

eyes black and white

Interface-passing style

My local repository of XCVB is in a transient broken state in the beginning of a big refactoring. I've decided that it was easier to declare it broken, then take each file one by one in order, update it, add tests to it, and declare it fixed before I get to the next one.

But more interestingly, in my next rewrite of XCVB, I want to make wider use of a functional programming style and functional datastructures. However, I looked at the state of Common Lisp libraries, and could find precious little in terms of functional datastructures, except for FSet which comes with its own CL dialect which is more than I'm ready to deal with, yet without the level of control I desire.

And thus I rolled my own implementation of balanced binary trees, and published it as part of lisp-interface-library (formerly part of fare-utils). Actually, I'm so proud of the API I designed that I think it's worth a minor article at a Lisp conference, or at the very least a Lightning talk or two at the Boston Lisp Meeting.

  • The general idea is to systematically use composable first-class "interface" objects to drive the choice of algorithms.
  • These interface objects are typically used as first argument to any generic function in the API.
  • Interfaces can easily share common sub-algorithms with interface inheritance and interface mixins.
  • Interfaces can have slots that contain other interfaces to which they delegate some subproblems, and functions can build first-class interfaces by composing simpler interfaces into more elaborate ones, yielding parametric polymorphism.
  • Going from binary-trees to balanced binary-trees is literally one defmethod away, plus a trivial helper and two trivial defclasses. Very pleasant.
  • Algorithmic interfaces are decoupled from data structure constructors, and particularly from state management. Indeed, I'm mostly using constant algorithmic interfaces and pure functional datastructures. This is also very pleasant, unlike the horrors of standard OO style that conflate independent concerns of algorithm selection, data-structure and state encapsulation.
  • A very same datastructure can be viewed in different lights by being passed to the same function with different interfaces, and this makes it very easy to compose interfaces.
  • For instance, a pure hash table is a balanced tree of buckets, each bucket holding one (or at most a few) entries the hash values of which collide. This is trivially implemented by composing balanced trees and alists, so that the very same node can be viewed either as a hash-table of elements, or a tree of buckets.
  • Pure hash-table and alist interfaces are each parametrized by the interface that defines hash-function and equality on keys.
  • With this technique, you can have all the parametric polymorphism that Haskell type classes, ML functors, C++ templates, Java generic interfaces, PLT Scheme units, etc., provide, except without the aggravation or limitations of any of them, plus it fits nicely into the native Lisp style, and it takes advantage of CLOS expressiveness and optimizations.
  • While I was at it, I also implemented Big-Endian Patricia Trees aka "fast mergeable integer maps", with an additional obvious optimization for merging dense maps faster.
  • All my ventures with that interface passing style of programming were very pleasant. And I didn't feel like using a monad to pass interfaces around.

For instance, given an interface i that implements some kind of pure functional maps, you can lookup a key k in a map m with (lookup i m k) independently from that map being a function, tree, a hash-table, a trie, or whatever clever thing the user came up with. You can create interfaces from other interfaces, such as (<alist> eq:<equal>) which will create an interface for maps as association lists using equal as their equality predicate. Interfaces can take functions, numbers, etc., as parameters. You can implement order domains as parametrized by a string-lessp-like comparison predicate or by a strcmp-like function. You could have have n,m-Btrees, etc.

Note that "my" idea is not novel at all. Haskell typically implements type classes exactly this way: see for instance Implementing Type Classes (1993) by John Peterson and Mark Jones. Haskell has the advantage that it will automatically infer the type class for you and pass it around as an implicit argument. In a dynamic language like Common Lisp, you have to explicitly pass the interface around, which on the one hand is cumbersome, but on the other, gives you more control: you can express parametric polymorphism, existential quantification, dependent types, and multiple ways to view a very same data structure as being an instance of the same interface protocol, which is especially great when bootstrapping elaborate versions of a protocol from simpler versions of the same (as is done with hash-tables). Of course, using Haskell, you could also go from one point of view to the other by wrapping objects inside view-specific constructors; but then, if your algorithm switches point of view constantly as for a Necker Cube, you may waste a lot of space and possibly leak as you keep creating new wrapped objects; whereas with a first-class interface separate from the data, you can just switch interface without consing.

You could probably bootstrap a nice object system this way on a language that doesn't have objects, but using an existing object system to combine and compose interfaces actually helps a lot. Note that such interface objects are a great target for partial evaluation, hot spot optimization, etc. Happily, SBCL seems to be doing just that for you.

It's amazing the things you can do with proper use of a λ!

PS: see my ILC'2012 article LIL: CLOS reaches higher-order, sheds identity and has a transformative experience [source, HTML, PDF]