October 7th, 2014

The Far Future of Programming: Ems

I had the privilege of reading a draft of Robin Hanson's upcoming book on ems: emulated brains that, with specialized hardware, could possibly run thousands or millions of times faster than the actual brains they were once templated from. This got me thinking about what kind of programming languages these ems would use, though most arguments would also apply to any AI, whether or not it is based on such ems. And yes, there will be programming languages in such a future: predictable algorithmic tasks aren't going to write and deploy themselves; the buck has to stop with someone, and that someone will be a sentient being who can be held accountable.

When you're going at 10000 times the speed of a human, computers run 10000 times slower in subjective time. Of course, an em could put itself in sleep mode and be swapped out of the compute cluster during any long computation, so that most computer interactions happen subjectively instantaneously even if they are actually very slow. An alarm on timeout would allow the em to avoid being indefinitely swapped out, at which point it could decide to resume, cancel or background the computation. Still, even if subjectively instantaneous, overly slow computations would disrupt the em's social, professional and personal life. Ultimately, latency kills you, possibly literally so: the latency may eat into the finite allowance of time during which your skills are marketable enough to finance your survival. Overly fast ems won't be able to afford being programmers; and there is thus a limit to how fast brain emulation can speed up the evolution of software, or any creative endeavour, really. In other words, Amdahl's Law applies to ems. So does Gustafson's Law, and programming ems will thus use their skills to develop greater artifacts than is currently possible.
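
As a back-of-the-envelope illustration of these two laws, here is a minimal Python sketch. It assumes, purely for the sake of the example, that 10% of a project's wall-clock time is spent waiting on machines that do not get any faster when the em does; that figure and the function names are mine, not Hanson's.

```python
def amdahl_speedup(fixed_fraction: float, n: float) -> float:
    """Amdahl's Law: same workload, only the non-fixed part runs n times faster."""
    return 1.0 / (fixed_fraction + (1.0 - fixed_fraction) / n)


def gustafson_speedup(fixed_fraction: float, n: float) -> float:
    """Gustafson's Law: same time budget, the non-fixed part of the work grows n times."""
    return fixed_fraction + (1.0 - fixed_fraction) * n


# Hypothetical numbers: an em thinking 10000 times faster than a human,
# on a project where 10% of the time is machine latency that does not shrink.
EM_SPEED = 10_000
MACHINE_FRACTION = 0.10

fixed_project = amdahl_speedup(MACHINE_FRACTION, EM_SPEED)      # just under 10
scaled_project = gustafson_speedup(MACHINE_FRACTION, EM_SPEED)  # about 9000
```

Under these assumed numbers, the same project finishes less than 10 times sooner no matter how fast the em thinks, whereas in the same wall-clock time the em can tackle a project roughly 9000 times larger. That is the sense in which fast ems build greater artifacts rather than merely building today's artifacts faster.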

Now, if you can afford to simulate a brain, memory and parallelism will be many orders of magnitude cheaper than they are now, in roughly inverse proportion to latency; so for fast ems, the price ratio between parallelism and latency will be multiplied by this factor squared. To take advantage of parallelism while minimizing latency, fast ems will thus use programming languages that are very terse and abstract, minimizing any boilerplate that increases latency, yet extremely efficient in a massively parallel setting because they are designed for parallelism. Much more like APL than like Java. Since running the code is expensive, bugs that waste programmer latency will be much more expensive than they are now. In some ways programmers will be experiencing echoes of the bad old days of batch processing with punch cards and may lose the fancy interactive graphical interfaces of today; yet in other ways, their development environments will be more modern and powerful than what we use now. Any static or dynamic check that can be done in parallel with respectively developing or running the code will be done: the Lisp machine will be back, except it will also sport fancy static type systems. Low-level data corruption will be unthinkable; and even what we currently think of as high-level might be low-level to fast em programmers: declarative meta-programming will be the norm, with the computer searching through large spaces of programs for solutions to meta-level constraints, since machine time is much cheaper than brain time as long as it can be parallelized. Programmers will be very parsimonious in the syntax and semantics of their programs and programming languages; they will favor both highfalutin abstraction and ruthless efficiency over any kind of fanciness. If you don't grok both category theory and code bumming, you won't be the template for the em programmer of the future.
Instead, imagine millions of copies of Xavier Leroy or Edward Kmett or the most impressive hacker you've ever met programming in parallel: there will be no place for second-rate programmers when you can instead hire a copy of the very best to use your scarce em cycles; only the best in their own field or combination of fields will have marketable skills.
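
To make "the computer searching through large spaces of programs" concrete, here is a toy sketch in Python of what such a search could look like today: the programmer states constraints (here, input/output examples) and the machine enumerates tiny arithmetic expressions until one satisfies them all. The expression grammar, the names and the depth limit are all invented for this illustration; a real system would search a far richer space with far smarter pruning.

```python
from itertools import product

LEAVES = ["x", "1", "2"]   # hypothetical toy grammar: one variable, two constants
OPS = ["+", "-", "*"]


def expressions(depth):
    """Yield every expression of nesting depth at most `depth`, smallest first."""
    if depth == 0:
        yield from LEAVES
        return
    smaller = list(expressions(depth - 1))
    yield from smaller
    for op in OPS:
        for left, right in product(smaller, repeat=2):
            yield f"({left} {op} {right})"


def synthesize(examples, max_depth=2):
    """Return the first expression in x that agrees with every (input, output) pair."""
    for expr in expressions(max_depth):
        # eval is only ever applied to strings generated by expressions() above.
        candidate = eval(f"lambda x: {expr}")
        if all(candidate(x) == y for x, y in examples):
            return expr
    return None


# Constraints describing the function 2x + 1, given only as examples.
found = synthesize([(0, 1), (1, 3), (2, 5)])
```

Each candidate is checked independently of the others, so the inner test is trivially parallel: exactly the kind of work where machine time can be traded for brain time.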

At high speed, though, latency becomes a huge bottleneck of social programming, even for these geniuses, and interplanetary travel will only make that worse. Bug fixes and new features will take forever to be published then accepted by everyone, and every team will have to develop in parallel its own redundant changes to common libraries: what to us are simple library changes might be as expensive to fast ems as agreeing on a standard document is to us. Since manual merges of code are expensive, elaborate merge algorithms will be developed, and programming languages will be modified if needed to make code merges easier. To reduce the number of conflicts, it will be important to canonicalize changes. Not only will each project have an automatically enforced Programming Style; copies of the very same maintenance ems will be present in every geographical zone to canonicalize bug fixes and feature enhancements of a given library. Software may therefore come with copies of the virtual wetware that is supposed to maintain the software, in a ready-to-code mood (or ready-to-explain mood), in a fully deterministic internal state and environment, for complete reproducibility and debuggability. Canonicalization also allows for better caching of results when looking for otherwise expensive solutions to often-used problems.
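
A present-day miniature of this idea, sketched in Python under the assumption of Python 3.9 or later (for `ast.unparse`): two teams write the same fix with different layout and comments, a canonicalizer maps both to the same text, and a cache keyed on the canonical form means the expensive work is done only once. The helper names and the in-memory cache are hypothetical, chosen only for the illustration; canonicalizing away deeper differences than layout is of course the hard part.

```python
import ast
import hashlib


def canonical_form(source: str) -> str:
    """Parse and re-emit the code, discarding layout, comments and redundant parentheses."""
    return ast.unparse(ast.parse(source))


def canonical_key(source: str) -> str:
    """Content hash of the canonical form; equal for fixes that differ only in style."""
    return hashlib.sha256(canonical_form(source).encode("utf-8")).hexdigest()


_cache = {}  # hypothetical in-memory result cache, keyed by canonical hash


def cached_check(source: str, expensive_check):
    """Run expensive_check on the canonical form, at most once per distinct canonical form."""
    key = canonical_key(source)
    if key not in _cache:
        _cache[key] = expensive_check(canonical_form(source))
    return _cache[key]


# The same bug fix, as written independently in two places.
fix_a = "def f(x):\n    return (x+1)\n"
fix_b = "def f( x ):  # off-by-one fix\n    return x + 1\n"
same_fix = canonical_key(fix_a) == canonical_key(fix_b)
```

Here `same_fix` holds because both versions parse to the same syntax tree, so a merge tool comparing canonical forms would see no conflict between them.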

Because programming itself can be parallelized by temporarily multiplying the number of ems, programming languages will be extremely composable. Modularity, separate compilation, advanced type systems and contract systems to specify interfaces, isolation through compile-time proofs, link-time enforcement or run-time virtualization, the ability to view the code as pure functional (with side-effects encapsulated in monads), etc., will be even more important than they are now. Expressiveness will also be very important to maximize what each worker can do; macros, dependent types, the ability to view the code in direct style (using side-effects and delimited continuations), etc., will be extremely important too. Development tools will manage the transformation back and forth between these two dual styles of viewing software. Thus armed with suitable composability, Conway's Law need not constrain software more than the fact that it's ultimately represented as an expression tree. What is more, if the workers on each subexpression are forks of the worker on the top expression, there can be some coherence of design across a very large system that currently would have required many programmers with different personalities. In this context, comments may be literally "leaving a note to yourself": a parallel duplicate self instead of a sequential future self.

As programming is recursively divided into tasks, the programmer becomes his own recursive Mechanical Turk. There is an interesting issue, though, when additional requirements appear while trying to solve a subproblem and those requirements call for modifying a higher-level problem: if you let the worker who found and grokked the new requirement survive and edit the problem, this may create a bad incentive for workers to find problems so they may survive, and a problem of prioritizing or merging the insights of many parallel copies of the programmer who each found issues. It might be cheaper to have the subproblem workers issue an explanation for use by the superproblem worker, who will either send updates to other workers, or restart them with an updated subproblem specification. Ultimately, large teams of "the same" programmer mean that coordination costs will be drastically lower than they are currently. Software will thus scale in breadth vastly beyond what currently exists, though in depth it will still be limited to how much a single programmer can handle.
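
The second protocol, where subproblem workers only report upward and the superproblem worker decides, can be sketched as a small Python simulation. Everything here is a stand-in invented for the example: the "work" is squaring numbers, the "new requirement" is a missing policy for negative inputs, and threads play the part of forked ems. The sketch shows only the restart option, not the cheaper option of sending updates to workers already running.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Done:
    result: int


@dataclass
class Note:
    """Explanation sent upward by a worker whose subproblem exposed a gap in the spec."""
    requirement: str


def worker(spec: dict, item: int):
    """Forked copy solving one subproblem; it never edits the spec itself."""
    if item < 0 and "negatives" not in spec:
        return Note("negatives")      # report the gap, then terminate
    if item < 0:
        return Done(0)                # hypothetical policy: clamp negatives to zero
    return Done(item * item)


def supervise(spec: dict, items, max_rounds: int = 3):
    """Superproblem worker: fork workers, merge their notes, restart with a revised spec."""
    for _ in range(max_rounds):
        with ThreadPoolExecutor() as pool:
            outcomes = list(pool.map(lambda item: worker(spec, item), items))
        notes = {o.requirement for o in outcomes if isinstance(o, Note)}
        if not notes:
            return spec, sum(o.result for o in outcomes)
        # Many workers may report the same gap; the set above merges duplicates.
        spec = {**spec, **{name: "clamp to zero" for name in notes}}
    raise RuntimeError("specification did not converge")


final_spec, total = supervise({}, [1, 2, -3])
```

Because only the supervisor edits the specification, no worker gains anything by inventing problems, and duplicate discoveries from parallel copies collapse into a single revision.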

Because the same programmer is duplicated a lot of times, personalizations of the development environment that increase productivity have a vastly multiplied effect. Extreme customization, to the point of reimplementing tools in a style that suits the particular programmer, is to be expected. Because new copies of the same programmer when young can replace old copies that retire or graduate, there is no fear that a completely different person will have to be retrained on those quite personal tools. The newcomer will be happily surprised that everything is just where he wished for (except when some subtle and important constraint prevented it, which might be worth understanding), and that all source code he has to deal with fits his own personal programming style. Still, deliverables might have to be in a more neutral style if they are to be shared by multiple programmers with different personalities so that each domain is handled by the most proficient expert, or if they have to be assessed, security-checked, proven correct, etc., by a third party as part of accepting the software before it's deployed in a sensitive environment or duplicated zillions of times.

I am sure there is a lot more that can be foreseen about the far future of programming. As for the near future, it won't be quite so different from what we have now, yet I think that a few of the above points may apply as the cost of bugs increases, including the cost of a competent programmer relative to the size of the codebase.

PS: Robin Hanson is interested in reading more ideas on this topic and ems in general. If you share them soon enough, they may make it to the final version of his book.