Illiterate Programming

Gilad Bracha

The text below was written some time ago, but I have not been blogging much and have let it languish. In the meantime, there has been some exciting activity in this field with the release of Eve. This is most encouraging, though I remain skeptical that the general programming population will take up literate programming; I'd be delighted to be proven wrong though.

1. Overview

Donald Knuth’s concept of literate programming has gained disappointingly little traction over the years. The idea that a program tells a story, a clear, well-articulated story, elegantly presented, is lost on most programmers.

Here's an idea that is even more lost: telling such a story is valuable for more than teaching and documentation. You, the programmer/author, learn from the storytelling, even if you are the only reader. As you tell the story you may well understand what's wrong with it.

And yet, very often programmers can hardly be bothered to write even basic comments in their code. Expert programmers usually disregard comments, because they are unlikely to be accurate. After all, if programmers don’t care to write comments in the first place, what are the odds that they will take the trouble to maintain them? Of course, this is both a self-fulfilling prophecy and a protective rationalization.

Changing this mindset is one Sisyphean task I am not about to undertake. Illiterate programming is the dominant paradigm and I fear it will remain so. So what is the point of discussing literate programming, let alone building software to support it?

One answer is that literate programming is about literature as well as programming. Knuth developed the concept when developing TeX and Metafont, which in turn were invented to support the writing of a book: The Art of Computer Programming. Literate programming is particularly good for writing about programs. Oddly, there are more people than ever writing about programs. Books, blogs, tutorials are proliferating like memory usage. So perhaps literate programming should focus on the literate programmers - those who write about (their) programs.

What should such a system look like? It has long been clear that one wants something more than the tools used to build the TeXbook. The benefits of a literate program grow the more interactive it is, the more one can experiment with it and play with it; the more live it is. So it is natural to seek live literate programming.

Ideally we'd want an interactive editor that smoothly integrated writing and all manner of interactive content. We already see interactive graphics in online newspapers and in technical e-books. Interactive fiction, where games and writing overlap, is another example. Tomas Petricek's thesis work is a nice example of academic writing benefiting from such an approach. Systems that support that sort of thing, done right, would also support live literate programming. The notebooks of Mathematica or Jupyter come to mind. I'll probably return to this topic in another post. For now, I'll consider a modest experiment.

We’ve been working on integrating Newspeak into Daan Leijen's Madoko. The document you are reading was created this way.

A literate programming system must provide a way for creating high quality documents. Madoko does that, allowing us to use LaTeX, Markdown and HTML in a useful mix. A literate programming system also needs to support programming. One has to be able to create and run programs.

Conveniently, a fragment of the Newspeak IDE runs in a web browser. This means that we can embed Newspeak development tools in HTML; and we can embed HTML in Madoko documents. Presto, we have the rudiments of a literate programming system for Newspeak. We'll call it Ampleforth, for obvious reasons.

Tangent: Ok, maybe they aren't so obvious. Ampleforth is a character in 1984 whose role is to convert works of Oldspeak literature into Newspeak. He's a literary editor conversant in Newspeak.

Below is a Newspeak evaluator. It's live: you can edit the expression and the result will recompute as you type. The result is printed out; note that the printout is a link. If you follow this link, it takes you to an inspector on the underlying object that was computed.

The object inspector is your gateway into the Newspeak IDE. You can see that 3+4 evaluates to 7; once you inspect 7, you can see that it is an instance of class Number. Click on Number and you enter a class presenter where you can browse its methods and link to its enclosing class.

You can also embed the IDE directly like so:

The above is the top level of the IDE; it provides links to the sources of all Newspeak module declarations in the system, via Newspeak sources. And it provides access to workspaces where you can evaluate code. If you click on the link labeled Workspaces you'll see the workspace manager, with an open workspace. You can evaluate code there. Type in the expression platform. The result is shown via a link below; if you follow the link you'll see an object inspector on the Newspeak platform, which is itself an object (chew on that for a while).

You can use this browser to explore the IDE itself. Click on the link labeled Inspect Presenter. Now you see an inspector on the IDE object that is presenting the platform object. Its class is shown to be ObjectPresenter. You can use that link to see the actual code used to present objects. In fact you can even change that code. The ability to change the code of the system you are using while using it is a key property that is lacking in most systems.

How does it work? Well, in the spirit of self application and dogfooding (it's a self-eating-dog-eat-self-eating dog world), let's explain Ampleforth using itself.

2. Newspeak LPs, the Short Version

In this section, I'll give a quick overview of how Newspeak literate programs work. Some detail will be glossed over: how the IDE and GUI work in the browser, and what machinations are needed to handle the back button especially.

You'll need some basic familiarity with Newspeak syntax. This short tutorial (also written with Ampleforth) covers the basics.

2.1. Displaying Literate Programs

The Ampleforth class implements the display of literate programs.

Ampleforth is a Newspeak application, which just means it follows certain conventions to make it deployable. The convention you need to understand right now is that execution of a Newspeak app always starts with the method main:args:.

Open main:args:. You'll see that we start out by instantiating the class HopscotchIDE, which implements the subset of the IDE that runs on the web. Ampleforth then adds itself to the IDE namespace, so we can browse it from within itself. The next step is to instantiate the class Embedder, which is responsible for embedding Newspeak widgets into the text of the surrounding web page. We then send the resulting instance, embedder, the message start.

Now click on the home icon in the upper right of the Ampleforth class presenter. This will take you to the IDE entry point. Then follow the link labeled Newspeak Source. It shows us a list of all the module declarations in the system. Select AmpleforthEmbedder. You can see that it has a public method start which sends a series of messages - processEvaluators, processMinibrowsers and processClassPresenters. Let's look at processEvaluators. The code iterates over a collection of DOM elements returned by calling domElementsWithClass: 'evaluator'. As you'd expect, this gives us the set of DOM elements with a class attribute with value evaluator. We go through each such element and insert an EmbeddedHopscotchWindow into it, opened on an evaluator for the element's expression, as given in its expression attribute.
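For readers who would rather not click through, here is a rough sketch of the shape of processEvaluators, transliterated into TypeScript. It is not the actual Newspeak code: the type names HostElement, HostRoot and MountEvaluator are made up for illustration, the mount callback is a hypothetical stand-in for opening an EmbeddedHopscotchWindow on an evaluator, and skipping elements that lack an expression attribute is my assumption rather than documented behavior.

```typescript
// Minimal structural types so the sketch also runs outside a browser.
// A real DOM Document and Element satisfy them as well.
interface HostElement {
  getAttribute(name: string): string | null;
}

interface HostRoot {
  getElementsByClassName(className: string): ArrayLike<HostElement>;
}

// Hypothetical stand-in for "insert an EmbeddedHopscotchWindow into the
// element, opened on an evaluator for the given expression".
type MountEvaluator = (host: HostElement, expression: string) => void;

// Find every element of class "evaluator" and mount an evaluator widget
// on the expression given in its expression attribute.
// Returns the number of widgets mounted.
function processEvaluators(root: HostRoot, mount: MountEvaluator): number {
  // Snapshot the collection first: in a real DOM it is live, and mounting
  // widgets modifies the page while we iterate.
  const found = root.getElementsByClassName("evaluator");
  const hosts: HostElement[] = [];
  for (let i = 0; i < found.length; i++) {
    hosts.push(found[i]);
  }

  let mounted = 0;
  for (let i = 0; i < hosts.length; i++) {
    const expression = hosts[i].getAttribute("expression");
    if (expression === null) {
      continue; // assumption: ignore elements with no expression attribute
    }
    mount(hosts[i], expression);
    mounted++;
  }
  return mounted;
}
```

In a browser one would pass document as the root; the callback is where all the interesting work (and all of Hopscotch) would live.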

A similar process is used for processMinibrowsers and for processClassPresenters. You can check those out if you wish.

When a web page representing an LP starts up, it includes Ampleforth (more precisely, ampleforth.js, the translation of Ampleforth into JavaScript). If you view the source of this page, you'll see the reference to the script at the very end of the source code. The script runs Ampleforth's main:args: method. As we just saw, this will have the effect of modifying the page's DOM to include Newspeak presenters for viewing the relevant code.

2.2. Creating Literate Programs

Now we can understand what Madoko has to do to help us create literate programs. We need to be able to embed HTML elements marked with class="evaluator", with an expression attribute of our choice. The example we started with, 3+4, is represented as

<div class="evaluator" expression="3+4"> </div>

(if you check the source, you'll see some additional webcrap Madoko puts in as well). For presenters other than evaluators, we would similarly use divs with the appropriate class and attributes. To embed a top level IDE browser (aka Minibrowser) we'd write

<div class="minibrowser"> </div>

We also need to put a reference to ampleforth.js at the end of the HTML page produced. It's important that the script goes at the end of the page, so that all elements of class evaluator are already in the DOM when the script runs.
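If you are producing the HTML by hand rather than through Madoko, the requirements so far can be sketched as a few small helpers. This is only an illustration under stated assumptions: the function names (escapeAttribute, evaluatorDiv, minibrowserDiv, literatePage) are hypothetical, and the exact form of the script tag is my guess, since all that matters here is that the reference to ampleforth.js comes last. The one real subtlety is that the expression lives in an HTML attribute, so quotes and angle brackets in Newspeak code need escaping.

```typescript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Placeholder div for a live evaluator on the given expression.
function evaluatorDiv(expression: string): string {
  return '<div class="evaluator" expression="' + escapeAttribute(expression) + '"> </div>';
}

// Placeholder div for a top level IDE browser.
function minibrowserDiv(): string {
  return '<div class="minibrowser"> </div>';
}

// Assemble a page whose body ends with the script reference, so that every
// placeholder div is already in the DOM when the script runs.
// Assumption: a plain <script src> tag; the real page may load it differently.
function literatePage(bodyParts: string[]): string {
  return ["<!DOCTYPE html>", "<html>", "<body>"]
    .concat(bodyParts, ['<script src="ampleforth.js"></script>', "</body>", "</html>"])
    .join("\n");
}
```

For example, literatePage(["<p>Some prose.</p>", evaluatorDiv("3+4"), minibrowserDiv()]) yields a page with both placeholders ahead of the script.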

To get Madoko to do this for us, we put [INCLUDE=newspeak] at the top of our Madoko file, which is the source for the literate program. This tells Madoko that we are in newspeak mode. Newspeak mode is due to Daan Leijen, and it is full of Madoko magic that we won't discuss further in this section.

To make this work smoothly, we'd want to teach newspeak mode about these special divs, so we can easily insert them into our document. Otherwise we have to insert the HTML into the doc ourselves. Currently Madoko's newspeak mode only knows about evaluators.

All this is quite nice, but a good deal more needs to be done to create a workable system. The biggest hurdle is getting the full functionality of the Newspeak IDE to run in the browser. Minibrowser doesn't have the debugger, and lacks many other features like integrated version control, unit test support and so on. We know how to fix these things, but we haven't gotten round to actually doing so.

Of course, one need not implement Ampleforth on top of the web, but it does seem to be the most attractive means of widely distributing such live program documents.

Finally, the current approach has one glaring architectural flaw, but I won't tell you what it is. It should be obvious.