Write Yourself a Scheme in 48 Hours/Evaluation, Part 2
Additional Primitives: Partial Application
[edit | edit source]Now that we can deal with type errors, bad arguments, and so on, we'll flesh out our primitive list so that it does something more than calculate. We'll add boolean operators, conditionals, and some basic string operations.
Start by adding the following into the list of primitives:
("=", numBoolBinop (==)),
("<", numBoolBinop (<)),
(">", numBoolBinop (>)),
("/=", numBoolBinop (/=)),
(">=", numBoolBinop (>=)),
("<=", numBoolBinop (<=)),
("&&", boolBoolBinop (&&)),
("||", boolBoolBinop (||)),
("string=?", strBoolBinop (==)),
("string<?", strBoolBinop (<)),
("string>?", strBoolBinop (>)),
("string<=?", strBoolBinop (<=)),
("string>=?", strBoolBinop (>=)),
These depend on helper functions that we haven't written yet: numBoolBinop
, boolBoolBinop
and strBoolBinop
. Instead of taking a variable number of arguments and returning an integer, these both take exactly two arguments and return a boolean. They differ from each other only in the type of argument they expect, so let's factor the duplication into a generic boolBinop
function that's parametrized by the unpacker function it applies to its arguments:
boolBinop :: (LispVal -> ThrowsError a) -> (a -> a -> Bool) -> [LispVal] -> ThrowsError LispVal
boolBinop unpacker op args = if length args /= 2
then throwError $ NumArgs 2 args
else do left <- unpacker $ args !! 0
right <- unpacker $ args !! 1
return $ Bool $ left `op` right
Because each argument may throw a type mismatch, we have to unpack them sequentially, in a do-block (for the Error
monad). We then apply the operation to the two arguments and wrap the result in the Bool
constructor. Any function can be turned into an infix operator by wrapping it in backticks (`op`
).
Also, take a look at the type signature. boolBinop
takes two functions as its first two arguments: the first is used to unpack the arguments from LispVal
s to native Haskell types, and the second is the actual operation to perform. By parameterizing different parts of the behavior, you make the function more reusable.
Now we define three functions that specialize boolBinop
with different unpackers:
numBoolBinop = boolBinop unpackNum
strBoolBinop = boolBinop unpackStr
boolBoolBinop = boolBinop unpackBool
We haven't told Haskell how to unpack strings from LispVal
s yet. This works similarly to unpackNum
, pattern matching against the value and either returning it or throwing an error. Again, if passed a primitive value that could be interpreted as a string (such as a number or boolean), it will silently convert it to the string representation.
unpackStr :: LispVal -> ThrowsError String
unpackStr (String s) = return s
unpackStr (Number s) = return $ show s
unpackStr (Bool s) = return $ show s
unpackStr notString = throwError $ TypeMismatch "string" notString
And we use similar code to unpack booleans:
unpackBool :: LispVal -> ThrowsError Bool
unpackBool (Bool b) = return b
unpackBool notBool = throwError $ TypeMismatch "boolean" notBool
Let's compile and test this to make sure it's working, before we proceed to the next feature:
$ ghc -package parsec -o simple_parser [../code/listing6.1.hs listing6.1.hs] $ ./simple_parser "(< 2 3)" #t $ ./simple_parser "(> 2 3)" #f $ ./simple_parser "(>= 3 3)" #t $ ./simple_parser "(string=? \"test\" \"test\")" #t $ ./simple_parser "(string<? \"abc\" \"bba\")" #t
Conditionals: Pattern Matching 2
[edit | edit source]Now, we'll proceed to adding an if-clause to our evaluator. As with standard Scheme, our evaluator considers #f
to be false and any other value to be true:
eval (List [Atom "if", pred, conseq, alt]) =
do result <- eval pred
case result of
Bool False -> eval alt
otherwise -> eval conseq
As the function definitions are evaluated in order, be sure to place this one above eval (List (Atom func : args)) = mapM eval args >>= apply func
or it will throw a Unrecognized primitive function args: "if"
error.
This is another example of nested pattern-matching. Here, we're looking for a 4-element list. The first element must be the atom if
. The others can be any Scheme forms. We take the first element, evaluate, and if it's false, evaluate the alternative. Otherwise, we evaluate the consequent.
Compile and run this, and you'll be able to play around with conditionals:
$ ghc -package parsec -o simple_parser [../code/listing6.2.hs listing6.2.hs] $ ./simple_parser "(if (> 2 3) \"no\" \"yes\")" "yes" $ ./simple_parser "(if (= 3 3) (+ 2 3 (- 5 1)) \"unequal\")" 9
List Primitives: car
, cdr
, and cons
[edit | edit source]For good measure, let's also add in the basic list-handling primitives. Because we've chosen to represent our lists as Haskell algebraic data types instead of pairs, these are somewhat more complicated than their definitions in many Lisps. It's easiest to think of them in terms of their effect on printed S-expressions:
(car '(a b c)) = a
(car '(a)) = a
(car '(a b . c)) = a
(car 'a) = error
– not a list(car 'a 'b) = error
–car
only takes one argument
We can translate these fairly straightforwardly into pattern clauses, recalling that (x : xs)
divides a list into the first element and the rest:
car :: [LispVal] -> ThrowsError LispVal
car [List (x : xs)] = return x
car [DottedList (x : xs) _] = return x
car [badArg] = throwError $ TypeMismatch "pair" badArg
car badArgList = throwError $ NumArgs 1 badArgList
Let's do the same with cdr:
(cdr '(a b c)) = (b c)
(cdr '(a b)) = (b)
(cdr '(a)) = NIL
(cdr '(a . b)) = b
(cdr '(a b . c)) = (b . c)
(cdr 'a) = error
– not a list(cdr 'a 'b) = error
– too many arguments
We can represent the first three cases with a single clause. Our parser represents '()
as List []
, and when you pattern-match (x : xs)
against [x]
, xs
is bound to []
. The other ones translate to separate clauses:
cdr :: [LispVal] -> ThrowsError LispVal
cdr [List (x : xs)] = return $ List xs
cdr [DottedList [_] x] = return x
cdr [DottedList (_ : xs) x] = return $ DottedList xs x
cdr [badArg] = throwError $ TypeMismatch "pair" badArg
cdr badArgList = throwError $ NumArgs 1 badArgList
cons
is a little tricky, enough that we should go through each clause case-by-case. If you cons together anything with Nil
, you end up with a one-item list, the Nil
serving as a terminator:
cons :: [LispVal] -> ThrowsError LispVal
cons [x1, List []] = return $ List [x1]
If you cons
together anything and a list, it's like tacking that anything onto the front of the list:
cons [x, List xs] = return $ List $ x : xs
However, if the list is a DottedList
, then it should stay a DottedList
, taking into account the improper tail:
cons [x, DottedList xs xlast] = return $ DottedList (x : xs) xlast
If you cons
together two non-lists, or put a list in front, you get a DottedList
. This is because such a cons
cell isn't terminated by the normal Nil
that most lists are.
cons [x1, x2] = return $ DottedList [x1] x2
Finally, attempting to cons
together more or less than two arguments is an error:
cons badArgList = throwError $ NumArgs 2 badArgList
Our last step is to implement eqv?
. Scheme offers three levels of equivalence predicates: eq?
, eqv?
, and equal?
. For our purposes, eq?
and eqv?
are basically the same: they recognize two items as the same if they print the same, and are fairly slow. So we can write one function for both of them and register it under eq?
and eqv?
.
eqv :: [LispVal] -> ThrowsError LispVal
eqv [(Bool arg1), (Bool arg2)] = return $ Bool $ arg1 == arg2
eqv [(Number arg1), (Number arg2)] = return $ Bool $ arg1 == arg2
eqv [(String arg1), (String arg2)] = return $ Bool $ arg1 == arg2
eqv [(Atom arg1), (Atom arg2)] = return $ Bool $ arg1 == arg2
eqv [(DottedList xs x), (DottedList ys y)] = eqv [List $ xs ++ [x], List $ ys ++ [y]]
eqv [(List arg1), (List arg2)] = return $ Bool $ (length arg1 == length arg2) &&
(all eqvPair $ zip arg1 arg2)
where eqvPair (x1, x2) = case eqv [x1, x2] of
Left err -> False
Right (Bool val) -> val
eqv [_, _] = return $ Bool False
eqv badArgList = throwError $ NumArgs 2 badArgList
Most of these clauses are self-explanatory, the exception being the one for two Lists
. This, after checking to make sure the lists are the same length, zip
s the two lists of pairs, and then uses the function all
to return False
if eqvPair
returns False
on any of the pairs. eqvPair
is an example of a local definition: it is defined using the where
keyword, just like a normal function, but is available only within that particular clause of eqv
. Since we know that eqv
only throws an error if the number of arguments is not 2, the line Left err -> False
will never be executed at the moment.
equal?
and Weak Typing: Heterogenous Lists
[edit | edit source]Since we introduced weak typing above, we'd also like to introduce an equal?
function that ignores differences in the type tags and only tests if two values can be interpreted the same. For example, (eqv? 2 "2") = #f
, yet we'd like (equal? 2 "2") = #t
. Basically, we want to try all of our unpack functions, and if any of them result in Haskell values that are equal, return True
.
The obvious way to approach this is to store the unpacking functions in a list and use mapM
to execute them in turn. Unfortunately, this doesn't work, because standard Haskell only lets you put objects in a list if they're the same type. The various unpacker functions return different types, so you can't store them in the same list.
We'll get around this by using a GHC extension – Existential Types – that lets us create a heterogenous list, subject to typeclass constraints. Extensions are fairly common in the Haskell world: they're basically necessary to create any reasonably large program, and they're often compatible between implementations (existential types work in both Hugs and GHC and are a candidate for standardization). Note you need to use a special compiler flag for this: -fglasgow-exts
as mentioned below; the newer -XExistentialQuantification
; or add the pragma {-# LANGUAGE ExistentialQuantification #-}
to the beginning of your code (In general, the compiler flag -Xfoo
can be replaced by the pragma {-# LANGUAGE foo #-}
inside the source file).
The first thing we need to do is define a data type that can hold any function from a LispVal -> something
, provided that that something
supports equality:
data Unpacker = forall a. Eq a => AnyUnpacker (LispVal -> ThrowsError a)
This is like any normal algebraic datatype, except for the type constraint. It says, "For any type that is an instance of Eq
, you can define an Unpacker
that takes a function from LispVal
to that type, and may throw an error". We'll have to wrap our functions with the AnyUnpacker
constructor, but then we can create a list of Unpacker
s that does just what we want it.
Rather than jump straight to the equal?
function, let's first define a helper function that takes an Unpacker
and then determines if two LispVal
s are equal when it unpacks them:
unpackEquals :: LispVal -> LispVal -> Unpacker -> ThrowsError Bool
unpackEquals arg1 arg2 (AnyUnpacker unpacker) =
do unpacked1 <- unpacker arg1
unpacked2 <- unpacker arg2
return $ unpacked1 == unpacked2
`catchError` (const $ return False)
After pattern-matching to retrieve the actual function, we enter a do-block for the ThrowsError
monad. This retrieves the Haskell values of the two LispVal
s, and then tests whether they're equal. If there is an error anywhere within the two unpackers, it returns False
, using the const
function because catchError
expects a function to apply to the error value.
Finally, we can define equal?
in terms of these helpers:
equal :: [LispVal] -> ThrowsError LispVal
equal [arg1, arg2] = do
primitiveEquals <- liftM or $ mapM (unpackEquals arg1 arg2)
[AnyUnpacker unpackNum, AnyUnpacker unpackStr, AnyUnpacker unpackBool]
eqvEquals <- eqv [arg1, arg2]
return $ Bool $ (primitiveEquals || let (Bool x) = eqvEquals in x)
equal badArgList = throwError $ NumArgs 2 badArgList
The first action makes a heterogenous list of [unpackNum, unpackStr, unpackBool]
, and then maps the partially applied (unpackEquals arg1 arg2)
over it. This gives a list of booleans, so we use the Prelude function or
to return true if any single one of them is true.
The second action tests the two arguments with eqv?
. Since we want equal?
to be looser than eqv?
, it should return true whenever eqv?
does so. This also lets us avoid handling cases like the list or dotted-list (though this introduces a bug; see exercise #2 in this section).
Finally, equal?
or
s both of these values together and wraps the result in the Bool
constructor, returning a LispVal
. The let (Bool x) = eqvEquals in x
is a quick way of extracting a value from an algebraic type: it pattern matches Bool x
against the eqvEquals
value, and then returns x
. The result of a let-expression is the expression following the keyword in
.
To use these functions, insert them into our primitives list:
("car", car),
("cdr", cdr),
("cons", cons),
("eq?", eqv),
("eqv?", eqv),
("equal?", equal)]
To compile this code, you need to enable GHC extensions with -fglasgow-exts
:
$ ghc -package parsec -fglasgow-exts -o parser [../code/listing6.4.hs listing6.4.hs] $ ./parser "(cdr '(a simple test))" (simple test) $ ./parser "(car (cdr '(a simple test)))" simple $ ./parser "(car '((this is) a test))" (this is) $ ./parser "(cons '(this is) 'test)" ((this is) . test) $ ./parser "(cons '(this is) '())" ((this is)) $ ./parser "(eqv? 1 3)" #f $ ./parser "(eqv? 3 3)" #t $ ./parser "(eqv? 'atom 'atom)" #t
Exercises |
---|
|