Sunday, September 26, 2010

View patterns for validation

GHC's ViewPatterns extension has a lot of unexpected uses. One that I've found recently is input validation.

{-# LANGUAGE ViewPatterns #-}

import Control.Monad
import Text.Printf

months :: [String]
months = words "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"

Using this function:

range :: (Ord a) => a -> a -> a -> a
range lb ub x
| (x < lb) || (x > ub) = error "argument out of range"
| otherwise = x

we can put bounds on arguments:

month :: Int -> String
month (range 1 12 -> m) = months !! (m-1)

like so:

GHCi> month 3
"Mar"
GHCi> month 15
"*** Exception: argument out of range

This version provides better error messages:

-- V for 'verbose'
rangeV :: (Ord a, Show a) => String -> a -> a -> a -> a
rangeV fun lb ub x
| (x < lb) || (x > ub) = error (printf msg fun (show x) (show lb) (show ub))
| otherwise = x
where msg = "%s: argument %s is outside of range [%s,%s]"

like so:

monthV :: Int -> String
monthV (rangeV "monthV" 1 12 -> m) = months !! (m-1)

GHCi> monthV 3 "Mar" GHCi> monthV 15 "*** Exception: monthV: argument 15 is outside of range [1,12]

Or handle failure monadically:

-- Mp for MonadPlus
rangeMp :: (MonadPlus m, Ord a) => a -> a -> a -> m a
rangeMp lb ub x = guard ((x >= lb) && (x <= ub)) >> return x

like so:

monthMaybe :: Int -> Maybe String
monthMaybe (rangeMp 1 12 -> Just m) = Just (months !! (m-1))
monthMaybe _ = Nothing

GHCi> monthMaybe 3 Just "Mar" GHCi> monthMaybe 15 Nothing

4 comments:

  1. Yeah, that's nice. Personally I prefer the latter and would prefer it with Either so that you get actual information about why it failed.

    > {-# LANGUAGE ViewPatterns #-}

    Supposing we use the Control.Applicative.Error class.

    > import Control.Applicative.Error

    And we define a function that ensures X is true:

    > ensure :: String -> (a -> Bool) -> a -> Failing a
    > ensure err p a
    > | p a = Success a
    > | otherwise = Failure [err]

    We can then define inRange in terms of it for Failing:

    > inRange :: (Show a,Ord a) => a -> a -> a -> Failing a
    > inRange lb ub x = ensure err (\a -> a>=lb && a<=ub) x where
    > err = "Must be in range >=" ++ show lb ++ " and <= " ++ show ub

    And our `month' and `day' functions are defined like this:

    > months :: [String]
    > months = words "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"

    > month :: Int -> Failing String
    > month (inRange 1 12 -> Success m) = Success $ months !! (m-1)
    > month _ = Failure ["Month must be in range 1-12."]

    > days :: [String]
    > days = words "Monday Tuesday Wednesday Thursday Friday Saturday Sunday"

    > day :: Int -> Failing String
    > day (inRange 1 7 -> Success d) = Success $ days !! (d-1)
    > day _ = Failure ["Day must be in range 1-7."]

    The nice thing is we get this for free:

    λ> (,) <$> month 1 <*> day 3
    Success ("Jan","Wednesday")

    λ> (,) <$> month 1 <*> day 12
    Failure ["Day must be in range 1-7."]

    λ> (,) <$> month 24 <*> day 12
    Failure ["Month must be in range 1-12.","Day must be in range 1-7."]

    It kind of sucks to be building strings all the time, it'd be
    nicer to have proper types to represent these invariants, e.g.

    > data Prop a = OutOfRange a | Empty a | WrongSize a | InvalidFormat a | Ok a

    Or something like that. I suppose throwing exceptions is faster
    than creating intermediate data structures.

    ReplyDelete
  2. Nice idea! But I would like to see listed in your post the alternative without view patterns:

    month :: Int -> String
    month i | inRange 1 12 i = months !! (i-1)
    | otherwise = error "..."

    inRange :: Ord a => a -> a -> a -> Bool
    inRange lb ub x = x >= lb && x <= ub

    (the blog will probably clobber indentation)

    ReplyDelete
  3. Thanks for the suggestions Chris, and for mentioning Control.Applicative.Error. It definitely includes a few things I reimplemented in a recent project; I'll want to switch over.

    ReplyDelete
  4. This is one of the things I love about (GHC) Haskell: you can often simply create a clever new "pseudo-syntax" out of the blue!!

    This "range checking pattern" would be rigid syntax in most other languages. Here it's just something that emerges from the programming paradigm.

    ReplyDelete