Thursday, February 2, 2012

Generating random functions

How can we pick a random Haskell function? Specifically, we want to write an IO action

randomFunction :: IO (Integer -> Bool)

with this behavior:

  • It produces a function of type Integer -> Bool.

  • It always produces a total function — a function which never throws an exception or enters an infinite loop.

  • It is equally likely to produce any such function.

This is tricky, because there are infinitely many such functions (more on that later).

In another language we might produce something which looks like a function, but actually flips a coin on each new integer input. It would use mutable state to remember previous results, so that future calls will be consistent. But the Haskell type we gave for randomFunction forbids this approach. randomFunction uses IO effects to pick a random function, but the function it picks has access to neither coin flips nor mutable state.

Alternatively, we could build a lazy infinite data structure containing all the Bool answers we need. randomFunction could generate an infinite list of random Bools, and produce a function f which indexes into that list. But this indexing will be inefficient in space and time. If the user calls (f 10000000), we'll have to run 10,000,000 steps of the pseudo-random number generator, and build 10,000,000 list elements, before we can return a single Bool result.

We can improve this considerably by using a different infinite data structure. Though our solution is pure functional code, we do end up relying on mutation — the implicit mutation by which lazy thunks become evaluated data.

The data structure

import System.Random
import Data.List ( genericIndex )

Our data structure is an infinite binary tree:

data Tree = Node Bool Tree Tree

We can interpret such a tree as a function from non-negative Integers to Bools. If the Integer argument is zero, the root node holds our Bool answer. Otherwise, we shift off the least-significant bit of the argument, and look at the left or right subtree depending on that bit.

get :: Tree -> (Integer -> Bool)
get (Node b _ _) 0 = b
get (Node _ x y) n =
case divMod n 2 of
(m, 0) -> get x m
(m, _) -> get y m

Now we need to build a suitable tree, starting from a random number generator state. The standard System.Random module is not going to win any speed contests, but it does have one extremely nice property: it supports an operation

split :: StdGen -> (StdGen, StdGen)

The two generator states returned by split will (ideally) produce two independent streams of random values. We use split at each node of the infinite tree.

build :: StdGen -> Tree
build g0 =
let (b, g1) = random g0
(g2, g3) = split g1
in Node b (build g2) (build g3)

This is a recursive function with no base case. Conceptually, it produces an infinite tree. Operationally, it produces a single Node constructor, whose fields are lazily-deferred computations. As get explores this notional infinite tree, new Nodes are created and randomness generated on demand.

get traverses one level per bit of its input integer. So looking up the integer n involves traversing and possibly creating O(log n) nodes. This suggests good space and time efficiency, though only testing will say for sure.

Now we have all the pieces to solve the original puzzle. We build two trees, one to handle positive numbers and another for negative numbers.

randomFunction :: IO (Integer -> Bool)
randomFunction = do
neg <- build `fmap` newStdGen
pos <- build `fmap` newStdGen
let f n | n < 0 = get neg (-n)
| otherwise = get pos n
return f

Testing

Here's some code which helps us visualize one of these functions in the vicinity of zero:

test :: (Integer -> Bool) -> IO ()
test f = putStrLn $ map (char . f) [-40..40] where
char False = ' '
char True = '-'

Now we can test randomFunction in GHCi:

λ> randomFunction >>= test
---- -   ---   -    - -   - --   - - -  -- --- -- --          - -- - - --  --- --
λ> randomFunction >>= test
-   ---- - - - -  - - -- -   -     ---  --- -- - --  -  --    - -  - - -  --   - 
λ> randomFunction >>= test
- ---  - - -  --  ---         -  --  -  -    -  -  - ---- - -  ---   -     -    -

Each result from randomFunction is indeed a function: it always gives the same output for a given input. This much should be clear from the fact that we haven't used any unsafe shenanigans. But we can also demonstrate it empirically:

λ> f <- randomFunction
λ> test f
-   -----  - -   -- - -   --- --  - -   - -   - -   -- - -   ---- - - - -  - --- 
λ> test f
-   -----  - -   -- - -   --- --  - -   - -   - -   -- - -   ---- - - - -  - --- 

Let's also test the speed on some very large arguments:

λ> :set +s
λ> f 10000000
True
(0.03 secs, 12648232 bytes)
λ> f (2^65536)
True
(1.10 secs, 569231584 bytes)
λ> f (2^65536)
True
(0.26 secs, 426068040 bytes)

The second call with 2^65536 is faster because the tree nodes already exist in memory. We can expect our tests to be faster yet if we compile with ghc -O rather than using GHCi's bytecode interpreter.

How many functions?

Assume we have infinite memory, so that Integers really can be unboundedly large. And let's ignore negative numbers, for simplicity. How many total functions of type Integer -> Bool are there?

Suppose we made an infinite list xs of all such functions. Now consider this definition:

diag :: [Integer -> Bool] -> (Integer -> Bool)
diag xs n = not $ genericIndex xs n n

For an argument n, diag xs looks at what the nth function of xs would return, and returns the opposite. This means the function diag xs differs from every function in our supposedly comprehensive list of functions. This contradiction shows that there are uncountably many total functions of type Integer -> Bool. It's closely related to Cantor's diagonal argument that the real numbers are uncountable.

But wait, there are only countably many Haskell programs! In fact, you can encode each one as a number. There may be uncountably many functions, but there are only a countable number of computable functions. So the proof breaks down if you restrict it to a real programming language like Haskell.

In that context, the existence of xs implies that there is some algorithm to enumerate the computable total functions. This is the assumption we ultimately contradict. The set of computable total functions is not recursively enumerable, even though it is countable. Intuitively, to produce a single element of this set, we would have to verify that the function halts on every input, which is impossible in the general case.

Now let's revisit randomFunction. Any function it produces is computable: the algorithm is a combination of the pseudo-random number procedure and our tree traversal. In this sense, randomFunction provides extremely poor randomness; it only selects values from a particular measure zero subset of its result type! But if you read the type constructor (->) as "computable function", as one should in a programming language, then randomFunction is closer to doing what it says it does.

Edit: See also Luke Palmer's recent article on this subject.

See also

The libraries data-memocombinators and MemoTrie use similar structures, not for building random functions but for memoizing existing ones.

You can download this post as a Literate Haskell file and play with the code.

1 comment:

  1. The set of total Haskell functions may not be recursively enumerable, but choose some total language (say, System F with structural recursion). The set of total functions in this language *is* recursively enumerable--generate a function and type check it. If it passes type checking, it's total.

    That is, we have

    interpret :: SystemFExpression -> Integer -> Bool
    generate :: Integer -> SystemFExpression

    However, we still haven't reached a contradiction; the diagonalization argument requires us to find an N such that (generate N) generates the function with source code:

    [code for interpret]
    [code for generate]
    f n = not (interpret (generate n) n)

    But any language that can implement its own interpreter isn't total!

    ReplyDelete