A good programming language will have many libraries building on a small set of core features. Writing and distributing libraries is much easier than dealing with changes to a language implementation. Of course, the choice of core features affects the scope of things we can build as libraries. We want a very small core that still allows us to build anything.

The lambda calculus can implement any computable function, and encode arbitrary data types. Technically, it's all we need to instruct a computer. But programs also need to be written and understood by humans. We fleshy meatbags will soon get lost in a sea of unadorned lambdas. Our languages need to have more structure.

As an example, the Scheme programming language is explicitly based on the lambda calculus. But it adds syntactic special forms for definitions, variable binding, conditionals, etc. Scheme also lets the programmer define new syntactic forms as macros translating to existing syntax. Indeed, `lambda`

and the macro system are enough to implement some of the standard special forms.

But we can do better. There's a simple abstraction which lets us define `lambda`

, Lisp or Scheme macros, and all the other special forms as mere library code. This idea was known as "fexprs" in old Lisps, and more recently as "operatives" in John Shutt's programming language Kernel. Shutt's PhD thesis [PDF] has been a vital resource for learning about this stuff; I'm slowly making my way through its 416 pages.

What I understand so far can be summarized by something self-contained and kind of cool. Here's the agenda:

I'll describe a tiny programming language named Qoppa. Its S-expression syntax and basic data types are borrowed from Scheme. Qoppa has no special forms, and a small set of built-in operatives.

We'll write a Qoppa interpreter in Scheme.

We'll write a library for Qoppa which implements enough Scheme features to run the Qoppa interpreter.

We'll use this nested interpreter to very slowly compute the factorial of 5.

All of the code is on GitHub, if you'd like to see it in one place.

# Operatives in Qoppa

An operative is a first-class value: it can be passed to and from functions, stored in data structures, and so forth. To use an operative, you apply it to some arguments, much like a function. The difference is that

The operative receives its arguments as unevaluated syntax trees, and

The operative also gets an argument representing the variable-binding environment at the call site.

Just as Scheme's functions are constructed by the `lambda`

syntax, Qoppa's operatives are constructed by `vau`

. Here's a simple example:

```
(define quote
(vau (x) env
x))
```

We bind a single argument as `x`

, and bind the caller's environment as `env`

. (Since we don't use `env`

, we could replace it with `_`

, which means to ignore the argument in that position, like Haskell's `_`

or Kernel's `#ignore`

.) The body of the `vau`

says to return the argument `x`

, unevaluated.

So this implements Scheme's `quote`

special form. If we evaluate the expression `(quote x)`

we'll get the symbol `x`

. As it happens, `quote`

is used sparingly in Qoppa. There is usually a cleaner alternative, as we'll see.

Here's another operative:

```
(define list (vau xs env
(if (null? xs)
(quote ())
(cons
(eval env (car xs))
(eval env (cons list (cdr xs)))))))
```

This `list`

operative does the same thing as Scheme's `list`

function: it evaluates any number of arguments and returns them in a list. So `(list (+ 2 2) 3)`

evaluates to the list `(4 3)`

.

In Scheme, `list`

is just `(lambda xs xs)`

. In Qoppa it's more involved, because we must explicitly evaluate each argument. This is the hallmark of (meta)programming with operatives: we selectively evaluate using `eval`

, rather than selectively suppressing evaluation using `quote`

.

The last part of this code deserves closer scrutiny:

`(eval env (cons list (cdr xs)))`

What if the caller's environment `env`

contains a local binding for the name `list`

? Not to worry, because we aren't quoting the name `list`

. We're building a cons pair whose car is the *value* of `list`

... an operative! Supposing `xs`

is `(1 2 3)`

, the expression

`(cons list (cdr xs))`

evaluates to the list

`(<some-value-representing-an-operative> 2 3)`

and *that's* what `eval`

sees. Just like `lambda`

, evaluating a `vau`

expression captures the current environment. When the resulting operative is used, the `vau`

body gets values from this captured static environment, not the dynamic argument of the caller. So we have lexical scoping by default, with the option of dynamic scoping thanks to that `env`

parameter.

Compare this situation with Lisp or Scheme macros. Lisp macros build code which refers to external stuff by name. Maintaining macro hygiene requires constant attention by the programmer. Scheme's macros are hygienic by default, but the macro system is far more complex. Rather than writing ordinary functions, we have to use one of several special-purpose sublanguages. Operatives provide the safety of Scheme macros, but (like Lisp macros) they use only the core computational features of the language.

# Implementing Qoppa

Now that you have a taste of what the language is like, let's write a Qoppa interpreter in Scheme.

We will represent an environment as a list of frames, where a frame is simply an association list. Within the `vau`

body in

`( (vau (x) _ x) 3 )`

the current environment would be something like

```
( ;; local frame
((x 3))
;; global frame
((cons <operative>)
(car <operative>)
...) )
```

Here's a Scheme function to build a frame from some names and the corresponding values.

```
(define (bind param val) (cond
((and (null? param) (null? val))
'())
((eq? param '_)
'())
((symbol? param)
(list (list param val)))
((and (pair? param) (pair? val))
(append
(bind (car param) (car val))
(bind (cdr param) (cdr val))))
(else
(error "can't bind" param val))))
```

We allow names and values to be arbitrary trees, so for example

```
(bind
'((a b) . c)
'((1 2) 3 4))
```

evaluates to

```
((a 1)
(b 2)
(c (3 4)))
```

(If you'll recall, `(x . y)`

is the pair formed by `(cons 'x 'y)`

, an improper list.) The generality of `bind`

means our argument-binding syntax — in `vau`

, `lambda`

, `let`

, etc. — will be richer than Scheme's.

Next, a function to find a `(name value)`

entry, given the name and an environment. This just invokes `assq`

on each frame until we find a match.

```
(define (m-lookup name env)
(if (null? env)
(error "could not find" name)
(let ((binding (assq name (car env))))
(if binding
binding
(m-lookup name (cdr env))))))
```

We also need a representation for operatives. A simple choice is that a Qoppa operative is represented by a Scheme procedure that takes the operands and current environment as arguments. Now we can write the Qoppa evaluator itself.

```
(define (m-eval env exp) (cond
((symbol? exp)
(cadr (m-lookup exp env)))
((pair? exp)
(m-operate env (m-eval env (car exp)) (cdr exp)))
(else
exp)))
(define (m-operate env operative operands)
(operative env operands))
```

The evaluator has only three cases. If `exp`

is a symbol, it refers to a value in the current environment. If it's a cons pair, the car must evaluate to an operative and the cdr holds operands. Anything else evaluates to itself: numbers, strings, Booleans, and Qoppa operatives (represented by Scheme procedures).

Instead of the traditional eval and apply we have "eval" and "operate". Thanks to our uniform representation of operatives, the latter is very simple.

# Qoppa builtins

Now we need to populate the global environment with useful built-in operatives. `vau`

is the most significant of these. Here is its corresponding Scheme procedure.

```
(define (m-vau static-env vau-operands)
(let ((params (car vau-operands))
(env-param (cadr vau-operands))
(body (caddr vau-operands)))
(lambda (dynamic-env operands)
(m-eval
(cons
(bind
(cons env-param params)
(cons dynamic-env operands))
static-env)
body))))
```

When applying `vau`

, you provide a parameter tree, a name for the caller's environment, and a body. The result of applying `vau`

is an operative which, when applied, evaluates that body. It does so in the environment captured by `vau`

, extended with arguments.

Here's the global environment:

```
(define (make-global-frame)
(define (wrap-primitive fun)
(lambda (env operands)
(apply fun (map (lambda (exp) (m-eval env exp)) operands))))
(list
(list 'vau m-vau)
(list 'eval (wrap-primitive m-eval))
(list 'operate (wrap-primitive m-operate))
(list 'lookup (wrap-primitive m-lookup))
(list 'bool (wrap-primitive (lambda (b t f) (if b t f))))
(list 'eq? (wrap-primitive eq?))
; more like these
))
(define global-env (list (make-global-frame)))
```

Other than `vau`

, each built-in operative evaluates all of its arguments. That's what `wrap-primitive`

accomplishes. We can think of these as functions, whereas `vau`

is something more exotic.

We expose the interpreter's `m-eval`

and `m-operate`

, which are essential for building new features as library code. We could implement `lookup`

as library code; providing it here just prevents some code duplication.

The other functions inherited from Scheme are:

Type predicates:

`null?`

`symbol?`

`pair?`

Pairs:

`cons`

`car`

`cdr`

`set-car!`

`set-cdr!`

Arithmetic:

`+`

`*`

`-`

`/`

`<=`

`=`

I/O:

`error`

`display`

`open-input-file`

`read`

`eof-object`

# Scheme as a Qoppa library

The Qoppa interpreter uses Scheme syntax like `lambda`

, `define`

, `let`

, `if`

, etc. Qoppa itself supports none of this; all we get is `vau`

and some basic data types. But this is enough to build a Qoppa library which provides all the Scheme features we used in the interpreter. This code starts out very cryptic, and becomes easier to read as we have more high-level features available. You can read through the full library if you like. This section will go over some of the more interesting parts.

Our first task is a bit of a puzzle: how do you define `define`

? It's only possible because we expose the interpreter's representation of environments. We can push a new binding onto the top frame of `env`

, like so:

```
(set-car! env
(cons
(cons <name> (cons <value> null))
(car env)))
```

We use this idea twice, once inside the `vau`

body for `define`

, and once to define `define`

itself.

```
((vau (name-of-define null) env
(set-car! env (cons
(cons name-of-define
(cons (vau (name exp) defn-env
(set-car! defn-env (cons
(cons name (cons (eval defn-env exp) null))
(car defn-env))))
null))
(car env))))
define ())
```

Next we'll define Scheme's `if`

, which evaluates one branch or the other. We do this in terms of the Qoppa builtin `bool`

, which always evaluates both branches.

```
(define if (vau (b t f) env
(eval env
(bool (eval env b) t f))))
```

We already saw the code for `list`

, which evaluates each of its arguments. Many other operatives have this behavior, so we should abstract out the idea of "evaluate all arguments". The operative `wrap`

takes an operative and returns a transformed version of that operative, which evaluates all of its arguments.

```
(define wrap (vau (operative) oper-env
(vau args args-env
(operate args-env
(eval oper-env operative)
(operate args-env list args)))))
```

Now we can implement `lambda`

as an operative that builds a `vau`

term, `eval`

s it, and then `wraps`

the resulting operative.

```
(define lambda (vau (params body) static-env
(wrap
(eval static-env
(list vau params '_ body)))))
```

This works just like Scheme's `lambda`

:

```
(define fact (lambda (n)
(if (<= n 1)
1
(* n (fact (- n 1))))))
```

Actually, it's incomplete, because Scheme's `lambda`

allows an arbitrary number of expressions in the body. In other words Scheme's

`(lambda (x) a b c)`

is syntactic sugar for

`(lambda (x) (begin a b c))`

`begin`

evaluates its arguments in order left to right, and returns the value of the last one. In Scheme it's a special form, because normal argument evaluation happens in an undefined order. By contrast, the Qoppa interpreter implements a left-to-right order, so we'll define `begin`

as a function.

```
(define last (lambda (xs)
(if (null? (cdr xs))
(car xs)
(last (cdr xs)))))
(define begin (lambda xs (last xs)))
```

Now we can mutate the binding for `lambda`

to support multiple expressions.

```
(define set! (vau (name exp) env
(set-cdr!
(lookup name env)
(list (eval env exp)))))
(set! lambda
((lambda (base-lambda)
(vau (param . body) env
(eval env (list base-lambda param (cons begin body)))))
lambda))
```

Note the structure

`((lambda (base-lambda) ...) lambda)`

which holds on to the original `lambda`

operative, in a private frame. That's right, we're using `lambda`

to save `lambda`

so we can overwrite `lambda`

. We use the same approach when defining other sugar, such as the implicit `lambda`

in `define`

.

There are some more bits of Scheme we need to implement: `cond`

, `let`

, `map`

, `append`

, and so forth. These are mostly straightforward; read the code if you want the full story. By far the most troublesome was Scheme's `apply`

function, which takes a function and a list of arguments, and is supposed to apply the function to those arguments. The problem is that our functions are really operatives, and expect to call `eval`

on each of their arguments. If we already have the values in a list, how do we pass them on?

Qoppa and Kernel have very different solutions to this problem. In Kernel, "applicatives" (things that evaluate all their arguments) are a distinct type from operatives. `wrap`

is the primitive constructor of applicatives, and its inverse `unwrap`

is used to implement `apply`

. This design choice simplifies `apply`

but complicates the core evaluator, which needs to distinguish applicatives from operatives.

For Qoppa I implemented `wrap`

as a library function, which we saw before. But then we don't have `unwrap`

. So `apply`

takes the uglier approach of quoting each argument to prevent double-evaluation.

```
(define apply (wrap (vau (operative args) env
(eval env (cons
operative
(map (lambda (x) (list quote x)) args))))))
```

In either Kernel or Qoppa, you're not allowed to apply `apply`

to something that doesn't evaluate all of its arguments.

# Testing

The code we saw above is split into two files:

`qoppa.scm`

is the Qoppa interpreter, written in Scheme`prelude.qop`

is the Qoppa code which defines`wrap`

,`lambda`

, etc.

I defined a procedure `execute-file`

which reads a file from disk and runs each expression through `m-eval`

. The last line of `qoppa.scm`

is

`(execute-file "prelude.qop")`

so the definitions in `prelude.qop`

are available immediately.

We start by loading `qoppa.scm`

into a Scheme interpreter. I'm using Guile here, but I've actually tested this with a variety of R5RS implementations.

```
$ guile -l qoppa.scm
guile> (m-eval global-env '(fact 5))
$1 = 120
```

This establishes that we've implemented the features used by `fact`

, such as `define`

and `lambda`

. But did we actually implement enough to run the Qoppa interpreter? To test this, we need to go deeper.

```
guile> (execute-file "qoppa.scm")
$2 = done
guile> (m-eval global-env '(m-eval global-env '(fact 5)))
$3 = 120
```

This is factorial implemented in Scheme, implemented as a library for Qoppa, implemented in Scheme, implemented as a library for Qoppa, implemented in Scheme (implemented in C). Of course it's outrageously slow; on my machine this `(fact 5)`

takes about 5 minutes. But it demonstrates that a tiny language of operatives, augmented with an appropriate library, can provide enough syntactic features to run a non-trivial Scheme program. As for how to do this *efficiently*, well, I haven't got far enough into the literature to have any idea.