# NP is the new P

In school I learned that NP-complete problems are intractably hard, chief among them Boolean satisfiability (SAT). My mind was thoroughly blown when I heard about SAT solvers and their success at a wide variety of real-world problems (1, 2, 3, 4). Apparently NP is the new P.

Setting aside the theory for a moment, what this implies is the feasibility (for some problems) of a new style of programming based on solving logical equations. We can describe a problem in terms of requirements for the solution, and leave the task of actually finding that solution to a highly-tuned general-purpose library.

There are quite a few SAT-solver libraries on Hackage. I tried some of them on some toy projects. I ran into a number of frustrations, as well as some code that just plain didn't work.

So I decided to write my own binding to Yices. Yices is an efficient and flexible solver for "SAT modulo theories" (SMT). This means that it understands not only Boolean logic but also types like integers, bit vectors, functions, etc. Unfortunately it's not open-source, but it does support a variety of interfaces to external code. The raw C API in bindings-yices was a great starting point, and today I'm releasing yices-easy 0.1.

yices-easy is not the first Yices binding on Hackage (1, 2, 3), nor will it be the last. It's not feature-complete, nor heavily optimized. The goal is simply to lower the entry barrier for Haskell developers (in particular, me) to learn about and use this powerful alien technology.

# Latin squares

Let's see an example! A Latin square is an n × n matrix where each row and column is a permutation of `[1..n]`. We can express these constraints directly using Yices's integer variables:

``import Yices.Easyimport Yices.Easy.Sugarimport Yices.Easy.Buildimport Control.Monad ( forM_, liftM2 )import qualified Data.Map as Mcell :: (Int,Int) -> Stringcell (x,y) = concat ["c", show x, "_", show y]query :: Int -> Queryquery n = execBuild \$ do  let cells = liftM2 (,) [1..n] [1..n]  forM_ cells \$ \c -> do    x <- declInt \$ cell c    assert ((x >=. 1) &&. (x <=. fromIntegral n))  forM_ cells \$ \c@(i0,j0) -> do    let notEq c1 = assert (Var (cell c) /=. Var (cell c1))    forM_ [i0+1..n] \$ \i -> notEq (i, j0)    forM_ [j0+1..n] \$ \j -> notEq (i0,j )run :: Int -> IO ()run n = do  Sat model <- solve \$ query n  let soln c = case M.lookup (cell c) model of Just (ValInt k) -> k      line i = forM_ [1..n] \$ \j -> putStr (show (soln (i,j)) ++ " ")  forM_ [1..n] \$ \i -> line i >> putChar '\n'``

Testing it:

``````GHCi> run 9
4 2 6 8 5 1 7 9 3
1 9 7 3 8 6 4 2 5
5 3 9 1 4 7 6 8 2
3 4 5 2 7 9 8 1 6
7 5 8 6 9 2 3 4 1
2 8 1 5 6 4 9 3 7
6 7 4 9 1 3 2 5 8
8 6 2 4 3 5 1 7 9
9 1 3 7 2 8 5 6 4
``````

This takes about 2 to 4 seconds on my laptop. Yices is nondeterministic, so you can get several Latin squares by running the solver repeatedly.

The function `query` builds a Yices `Query`. This is a totally first-order value that you can build and inspect in pure Haskell. We're using some infix syntactic sugar and some monadic bookkeeping, but these are not essential parts of the library. At the core, we just build a Yices syntax tree and hand it off to `solve`, which translates it to Yices's internal data structures and invokes the solver.

Our query consists of two parts. First we declare an integer-valued variable for each cell in the Latin square, and constrain them to take on values from `[1..n]`. Then we constrain each cell to differ from those directly below or to the right. `run` invokes the solver, then produces formatted output.

# Graph coloring

A classic NP-complete problem is graph coloring. Given is a graph of nodes and edges, and a set of k colors. We must pick a color for each node, such that nodes connected by an edge never have the same color. Again, we can express this directly in Yices:

``import Yices.Easyimport Yices.Easy.Sugarimport Yices.Easy.Buildimport Control.Monadimport Systemimport Data.Listimport qualified Data.Map      as Mimport qualified Data.GraphViz as Gtype Edge  = (Ident, Ident)data Graph = Graph [Ident] [Edge]parse :: String -> Graphparse g = Graph vs es where  es = map ((\[x,y] -> (x,y)) . words) \$ lines g  vs = nub \$ concat [ [x,y] | (x,y) <- es ]query :: Int -> Graph -> Queryquery n (Graph vs es) = execBuild \$ do  forM_ vs \$ \v -> do    x <- declInt v    assert ((x >=. 0) &&. (x <. fromIntegral n))  forM_ es \$ \(x,y) -> assert (Var x /=. Var y)render :: Graph -> Model -> Stringrender (Graph vs es) m = G.printDotGraph g where  g   = G.DotGraph False False Nothing \$ G.DotStmts gbl [] vss ess  gbl = [G.NodeAttrs [G.Style [G.SItem G.Filled []]]]  vss = [G.DotNode v [G.Color [G.X11Color \$ color v]] | v <- vs]  ess = [G.DotEdge x y False [] | (x,y) <- es]  colors = [G.Red, G.Green, G.Blue, G.Cyan, G.Magenta, G.Yellow, G.White]  color v = case M.lookup v m of Just (ValInt i) -> colors !! fromIntegral imain :: IO ()main = do  [nx,file] <- getArgs  numColors <- readIO nx  graph     <- parse `fmap` readFile file  result    <- solve \$ query numColors graph  case result of    Sat model -> writeFile "out.dot" \$ render graph model    _         -> putStrLn "No solution." >> exitFailure``

Unsurprisingly, most of this code is input / output glue. In `parse` we parse a very simple edge-list format (see below). `query` builds the Yices query and is quite similar to the previous example. (In fact, Latin squares are the colorings of a rook's graph.) `render` uses the graphviz library to build a colorful graph, which we can render with `dot`.

Let's color the Petersen graph:

``````\$ cat petersen.txt
Ao Bo
Bo Co
Co Do
Do Eo
Eo Ao
Ao Ai
Bo Bi
Co Ci
Do Di
Eo Ei
Ai Ci
Bi Di
Ci Ei
Di Ai
Ei Bi
\$ ghc --make -O graph-color.hs
\$ export LD_LIBRARY_PATH=~/local/lib
\$ ./graph-color 2 petersen.txt
No solution.
\$ ./graph-color 3 petersen.txt
\$ dot -Tpng out.dot > out.png
\$ xview out.png
``````

The coloring works! Too bad Graphviz can't recognize the striking symmetry in the Petersen graph.

Note that I had to specify `LD_LIBRARY_PATH` because `libyices.so` is not installed globally on my system. If that's the case for you, you will also need to pass `--extra-include-dirs=PATH` and `--extra-lib-dirs=PATH` options to `cabal` when you install `bindings-yices`, which is used by my library.

# What next?

So that's yices-easy. It's a young library; I need people to bang on it and find bugs. What can you do with an SMT solver?

## Sunday, September 26, 2010

### View patterns for validation

GHC's `ViewPatterns` extension has a lot of unexpected uses. One that I've found recently is input validation.

``{-# LANGUAGE ViewPatterns #-}import Control.Monadimport Text.Printfmonths :: [String]months = words "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"``

Using this function:

``range :: (Ord a) => a -> a -> a -> arange lb ub x  | (x < lb) || (x > ub) = error "argument out of range"  | otherwise = x``

we can put bounds on arguments:

``month :: Int -> Stringmonth (range 1 12 -> m) = months !! (m-1)``

like so:

``````GHCi> month 3
"Mar"
GHCi> month 15
"*** Exception: argument out of range
``````

This version provides better error messages:

``-- V for 'verbose'rangeV :: (Ord a, Show a) => String -> a -> a -> a -> arangeV fun lb ub x  | (x < lb) || (x > ub) = error (printf msg fun (show x) (show lb) (show ub))  | otherwise = x  where msg = "%s: argument %s is outside of range [%s,%s]"``

like so:

``monthV :: Int -> StringmonthV (rangeV "monthV" 1 12 -> m) = months !! (m-1)``
``````GHCi> monthV 3
"Mar"
GHCi> monthV 15
"*** Exception: monthV: argument 15 is outside of range [1,12]
``````

``-- Mp for MonadPlusrangeMp :: (MonadPlus m, Ord a) => a -> a -> a -> m arangeMp lb ub x = guard ((x >= lb) && (x <= ub)) >> return x``

like so:

``monthMaybe :: Int -> Maybe StringmonthMaybe (rangeMp 1 12 -> Just m) = Just (months !! (m-1))monthMaybe _ = Nothing``
``````GHCi> monthMaybe 3
Just "Mar"
GHCi> monthMaybe 15
Nothing
``````

## Saturday, September 25, 2010

### GHCi and Cabal

Maybe I'm the only one who didn't already know this, but I'd like to describe how to use GHCi within a Cabal project under development.

Say I'm working on a library `foobar`:

``````~/foobar \$ find -type f
./foobar.cabal
./Data/FooBar.hs
./Setup.hs
``````

I want to write a test program, which won't go in the Cabal distribution:

``````~/foobar \$ cat > test.hs
import Data.FooBar
main = print fooBar
^D

~/foobar \$ cabal configure && cabal build
Resolving dependencies...
Configuring foobar-0.1...
Preprocessing library foobar-0.1...
Building foobar-0.1...
Registering foobar-0.1...

~/foobar \$ ghci test.hs
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
[1 of 2] Compiling Data.FooBar      ( Data/FooBar.hs, interpreted )
[2 of 2] Compiling Main             ( test.hs, interpreted )
``````

GHCi is loading `Data.FooBar` into the interpreter, but I wanted to test the compiled, optimized version that was just produced by Cabal. Let's try telling it about the `dist/` directory:

``````~/foobar \$ ghci test.hs -package-conf dist/package.conf.inplace
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
[1 of 2] Compiling Data.FooBar      ( Data/FooBar.hs, interpreted )
[2 of 2] Compiling Main             ( test.hs, interpreted )
``````

Still no good. Seems that the existence of the file `Data/FooBar.hs` overrides anything in the package database. Move to another directory:

``````~/foobar      \$ mkdir test && cd test
~/foobar/test \$ mv ../test.hs .
~/foobar/test \$ ghci test.hs -package-conf ../dist/package.conf.inplace
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Main             ( test.hs, interpreted )
*Main> fooBar
"testing 123"
``````

It works. :) For compiled code as well:

``````~/foobar/test \$ ghc --make -O test.hs -package-conf ../dist/package.conf.inplace
[1 of 1] Compiling Main             ( test.hs, test.o )
~/foobar/test \$ ./test
"testing 123"
``````

## Friday, September 24, 2010

### clogparse: parsing #haskell IRC logs

The `#haskell` IRC channel on Freenode has archived logs going back to late 2001. That's a lot of data: over 10 million lines, almost 700 MB uncompressed. There's all kinds of clever analyses we could do, and I think I know what language we'd like to use.

So I wrote clogparse, a library for parsing these log files. Let's see who `lambdabot` has attacked this year:

``import Data.IRC.CLog.Parseimport System.Environment ( getArgs )import qualified Data.Text    as Textimport qualified Data.Text.IO as Textmain :: IO ()main = getArgs >>= mapM_ (\c -> do  es <- parseLog haskellConfig c  let m = Nick \$ Text.pack "lambdabot"  mapM_ Text.putStrLn [ t | EventAt _ (Act n t) <- es, n == m ])``

Running it:

``````\$ ghc --make -O -Wall test.hs
[1 of 1] Compiling Main             ( test.hs, test.o )

\$ ./test logs/10.* +RTS -A400M
hits lunabot with an assortment of kitchen utensils
will count to five...
would never hurt Cale!
pulls copumpkin through the Evil Mangler
throws some pointy lambdas at medfly
slaps Jafet
puts on her slapping gloves, and slaps cads
hits Twey with an assortment of kitchen utensils
pushes c_wraith for using fromJust from his chair
...
``````

This is allocation-heavy code. I increased the allocation area size with `+RTS -A400M`, which significantly improves performance.

Of course it's hard to beat `grep` for something so simple. Let's next look at when people join the channel throughout the day:

``import Data.IRC.CLog.Parseimport System.Environmentimport Data.Listimport Data.Timeimport Text.Printfimport Control.Concurrent.Spawnimport qualified Data.IntMap as M-- Map times to 10-minute buckets.bucket :: UTCTime -> Intbucket (UTCTime _ dt) = floor (toRational dt / 600)unbucket :: Int -> Stringunbucket n = printf "%02d:%02d" d (m*10) where  (d,m) = divMod n 6-- For use with 'alter'; increments an IntMap key strictly.inc :: (Num a) => a -> Maybe a -> Maybe ainc x Nothing  = Just xinc x (Just n) = Just \$! (n+x)-- Get a count of 'Join' events per time bucket for one file.get :: FilePath -> IO (M.IntMap Int)get p = do  es <- parseLog haskellConfig p  let bs = [ bucket t | EventAt t (Join _ _) <- es ]  return \$! foldl' (flip . M.alter \$ inc 1) M.empty bsmain :: IO ()main = do  p  <- pool 8  -- to avoid exhausting memory  ms <- getArgs >>= mapM (spawn . p . get) >>= sequence  -- IntMap lacks a strict 'unionWith', so fold over assocs.  let acc = foldl' (\h (k,v) -> M.alter (inc v) k h) M.empty      m   = acc \$ concatMap M.assocs ms  -- print (time, count) in gnuplot-friendly format  mapM_ (uncurry \$ printf "%s %8d\n")    [ (unbucket k, v) | (k,v) <- M.assocs m ]``

We count how many join events occur in each ten-minute interval of the day. We parse in a separate thread for each file, then consolidate the results. Spawning more than 3,000 threads is normally not a problem, but we don't want to load every log file into memory at once. We'd like to have the simplicity of one thread per file, but keep some of them blocked waiting for others to finish. That's exactly what the new `pool` function in spawn 0.2 does.

Let's see the results:

``````\$ ghc --make -O -Wall -threaded histogram.hs
\$ ./histogram logs/* +RTS -N2 > hist.dat
\$ gnuplot
gnuplot> set xdata time
gnuplot> set timefmt "%H:%M"
gnuplot> set format x "%H:%M"
gnuplot> plot 'hist.dat' using 1:2 with steps
``````

And we get this graph:

(Modulo some extra `gnuplot` tweaking on my part. Click for big.)

Times are labeled in UTC. You can sort of make sense of this data: there's a big peak around 09:10, at the beginning of the day in Europe, and some others later on, maybe from the USA. There's another big peak at 23:00; why? How much of this is due to particular institutions, netsplits, etc?

Anyway, that's a couple of small examples of what this library might be good for. I'm looking forward to seeing some clever applications in the future.

## Thursday, September 23, 2010

### Executing a ByteString

Why not!

``{-# LANGUAGE ForeignFunctionInterface #-}import Foreign ( FunPtr, withForeignPtr, castPtrToFunPtr )import System  ( getArgs )import qualified Data.ByteString as B ( pack, ByteString )import qualified Data.ByteString.Internal as B ( toForeignPtr )payload :: B.ByteStringpayload = B.pack [  0x55,                   -- push   %rbp  0x48, 0x89, 0xe5,       -- mov    %rsp,%rbp  0x89, 0x7d, 0xec,       -- mov    %edi,-0x14(%rbp)  0xc7, 0x45, 0xfc,    1, 0, 0, 0,           -- movl   \$0x1,-0x4(%rbp)  0xeb, 0x0e,             -- jmp    1e <fact+0x1e>  0x8b, 0x45, 0xfc,       -- mov    -0x4(%rbp),%eax  0x0f, 0xaf, 0x45, 0xec, -- imul   -0x14(%rbp),%eax  0x89, 0x45, 0xfc,       -- mov    %eax,-0x4(%rbp)  0x83, 0x6d, 0xec, 1,    -- subl   \$0x1,-0x14(%rbp)  0x83, 0x7d, 0xec, 1,    -- cmpl   \$0x1,-0x14(%rbp)  0x7f, 0xec,             -- jg     10 <fact+0x10>  0x8b, 0x45, 0xfc,       -- mov    -0x4(%rbp),%eax  0xc9,                   -- leaveq   0xc3 ]                  -- retqtype F = Int -> Intforeign import ccall "dynamic"  getF :: FunPtr F -> Fmain :: IO ()main = do  [s] <- getArgs  n   <- readIO s  let (fptr, _, _) = B.toForeignPtr payload  withForeignPtr fptr \$ \ptr -> do    let f = getF \$ castPtrToFunPtr ptr    print \$ f n``

Testing it:

``````\$ runhaskell foo.hs 6
720
``````

This code will only work on `x86-64`.

## Tuesday, September 21, 2010

### crystalfontz 0.1

I've uploaded a Haskell library for talking to Crystalfontz LCD displays -- specifically, the one model that I own. If you have a different LCD and would like to poke it from Haskell, or you're dying to use one of the many unimplemented commands, let me know.

### jspath 0.1

I've uploaded the first version of jspath, a small library which traverses JSON structures to extract substructures matching a path.

### Higher-rank type constraints

I recently encountered a Haskell function which apparently requires "higher-rank" type constraints. I worked around the issue with some type trickery. I'd like to share the problem and the workaround, and especially get feedback on whether there's a better solution.

I did say type trickery:

``{-# LANGUAGE    GADTs  , FlexibleContexts  , RankNTypes  , ScopedTypeVariables #-}``

Suppose we're implementing an in-place sorting algorithm. We'll use the `ST` monad, so that we can expose a pure interface. Assuming that we only want to sort primitive types, we will use the unboxed `STUArray` type, which should improve space and time usage. Actually, it's totally unwarranted premature optimization for the sake of the problem. :)

``import Control.Monad.STimport Data.Array.ST``

The monadic part of this code is simple enough. I've left out the part where we actually, you know, sort the list. Consult your bedside copy of CLRS.

``sortM  :: forall a s.     (Ord a, MArray (STUArray s) a (ST s))  => [a]  -> ST s [a]sortM xs = do  arr <- newListArray (1, length xs) xs           :: ST s (STUArray s Int a)  -- do some in-place sorting here  getElems arr``

Note that `sortM` has a constraint `MArray (STUArray s) a (ST s)`. All we're really saying is that `a` is a type primitive enough that `STUArray` can unbox it. The rest of the constraint just connects the concrete world of `STUArray` with the generic `MArray` interface.

Note also that we give an explicit type on our call to `newListArray`. Otherwise, there's nothing to constrain what sort of array we get, and GHC will complain of ambiguity. Furthermore, we need to unify this `a` and `s` with the `a` and `s` from the signature on `sortM`. We do that using the `ScopedTypeVariables` extension and the explicit `forall`. (If you turn on `UnicodeSyntax` you can write it as `∀`; how cool is that?)

Good enough so far. Now, the point of using `ST` was to expose a pure interface, right?

``sortP_1 :: (Ord a) => [a] -> [a]sortP_1 xs = runST (sortM xs)``

Suddenly GHC is very unhappy:

``````    Could not deduce (MArray (STUArray s) a (ST s)) from the context ()
arising from a use of `sortM' at code.hs:21:20-27
Possible fix:
add (MArray (STUArray s) a (ST s)) to the context of
the polymorphic type `forall s. ST s a'
or add an instance declaration for (MArray (STUArray s) a (ST s))
In the first argument of `runST', namely `(sortM xs)'
In the expression: runST (sortM xs)
In the definition of `sortP_1': sortP_1 xs = runST (sortM xs)
``````

Okay, fair enough. We've given no reason for GHC to believe that `a` is an `STUArray`-compatible type. Let's copy over the constraint from `sortM`:

``sortP_2  :: (Ord a, MArray (STUArray s) a (ST s))  => [a] -> [a]sortP_2 xs = runST (sortM xs)``

Still no good:

``````    Could not deduce (MArray (STUArray s1) a (ST s1))
from the context ()
arising from a use of `sortM' at code.hs:27:20-27
Possible fix:
add (MArray (STUArray s1) a (ST s1)) to the context of
the polymorphic type `forall s. ST s a'
or add an instance declaration for (MArray (STUArray s1) a (ST s1))
In the first argument of `runST', namely `(sortM xs)'
In the expression: runST (sortM xs)
In the definition of `sortP_2': sortP_2 xs = runST (sortM xs)
``````

Note that GHC gives an error about the variable `s1`, not `s`. This is a valuable hint. Our caller (notionally) chooses one `s` when she provides evidence to satisfy the constraint `MArray (STUArray s) a (ST s)`. Then `runST` (notionally) gets to choose a different `s` because of its higher-rank type.

What to do? It's useless for our caller to pick some particular `s` and provide evidence. We need that evidence for all `s`:

``sortP_3  :: (Ord a, forall s. MArray (STUArray s) a (ST s))  => [a] -> [a]sortP_3 xs = runST (sortM xs)``

Unfortunately, GHC does not support this "higher-rank constraint":

``````code.hs:32:13: malformed class assertion
``````

Well... the `->` of functions and the `=>` of constraints are not as different as they may appear. A Haskell compiler can implement type classes by translating constraints to explicit "dictionary" arguments. Let's do the same thing ourselves:

``data Evidence s e where  Evidence    :: (MArray (STUArray s) e (ST s))    => Evidence s e``

We don't quite need to reproduce the `MArray` methods in our "dictionary". (That's good, because some of them are hidden. A strange class indeed.) It's enough to write a GADT constructor with this constrained type. The constructor will implicitly capture the constraint evidence (such as a dictionary) from its call-site.

Now we need to express the universal quantification on `s`:

``data ElemType e = ElemType (forall s. Evidence s e)``

A (non-⊥) value of type `ElemType e` is evidence that `e` is a suitable element type for `STUArray`. That's all we wanted the caller to say in the first place!

Now we can pass the appropriate evidence:

``sortP_4  :: forall a.     (Ord a)  => ElemType a  -> [a] -> [a]sortP_4 (ElemType Evidence) xs = runST (sortM xs)``

Not quite:

``````    Couldn't match expected type `forall s. Evidence s a'
against inferred type `Evidence s e'
In the pattern: Evidence
In the pattern: ElemType Evidence
In the definition of `sortP_4':
sortP_4 (ElemType Evidence) xs = runST (sortM xs)
``````

GHC still wants some coaxing to match up this `s` with the other `s`. We push the pattern-matching of `Evidence` inside `runST`:

``sortP_5  :: forall a.     (Ord a)  => ElemType a  -> [a] -> [a]sortP_5 (ElemType e) xs = runST (f e) where  f :: Evidence s a -> ST s [a]  f Evidence = sortM xs``

And...

``````[1 of 1] Compiling Main             ( code.hs, interpreted )
*Main>
``````

A glorious sight! Then, a moment of concern: is `sortP_5` actually usable? Does `ElemType` have non-⊥ values? Let's check:

``````*Main> sortP_5 (ElemType Evidence) ([3,4] :: [Int])
[3,4]
``````

Nice. It's sorted and everything. :) As expected, this works only for unboxable types:

``````*Main> sortP_5 (ElemType Evidence) ["foo","bar"]

<interactive>:1:18:
Could not deduce (MArray (STUArray s) [Char] (ST s))
from the context ()
arising from a use of `Evidence' at <interactive>:1:18-25
``````

As usual, this interesting exercise came from a question on the `#haskell` IRC channel. That's why it's one of my favorite places, even if they do specialize in making GHC cry.