Jump to content

Haskell/Alternative and MonadPlus

From Wikibooks, open books for an open world
(Redirected from Haskell/MonadPlus)

In our studies so far, we saw that both Maybe and lists can represent computations with a varying number of results. We use Maybe to indicate a computation can fail somehow (that is, it can have either zero results or one result), and we use lists for computations that can have many possible results (ranging from zero to arbitrarily many results). In both of these cases, one useful operation is amalgamating all possible results from multiple computations into a single computation. With lists, for instance, that would amount to concatenating lists of possible results. The Alternative class captures this amalgamation in a general way.

Definition

[edit | edit source]

Note

The Alternative class and its methods can be found in the Control.Applicative module.


Alternative is a subclass of Applicative whose instances must define, at a minimum, the following two methods:

class Applicative f => Alternative f where
  empty :: f a
  (<|>) :: f a -> f a -> f a

empty is an applicative computation with zero results, while (<|>) is a binary function which combines two computations.

Here are the two instance definitions for Maybe and lists:

instance Alternative Maybe where
  empty               = Nothing
  -- Note that this could have been written more compactly.
  Nothing <|> Nothing = Nothing -- 0 results + 0 results = 0 results
  Just x  <|> Nothing = Just x  -- 1 result  + 0 results = 1 result
  Nothing <|> Just x  = Just x  -- 0 results + 1 result  = 1 result
  Just x  <|> Just y  = Just x  -- 1 result  + 1 result  = 1 result:
                                -- Maybe can only hold up to one result,
                                -- so we discard the second one.

instance Alternative [] where
  empty = []
  (<|>) = (++) -- length xs + length ys = length (xs ++ ys)

Example: parallel parsing

[edit | edit source]

Traditional input parsing involves functions which consume an input one character at a time. That is, a parsing function takes an input string and chops off (i.e. "consumes") characters from the front if they satisfy certain criteria. For example, you could write a function which consumes one uppercase character. If the characters on the front of the string don't satisfy the given criteria, the parser has failed. In the example below, for instance, we consume a digit in the input and return the digit that was parsed. The possibility of failure is expressed by using Maybe.

digit i (c:_)
  | i > 9 || i < 0 = Nothing
  | otherwise      =
    if [c] == show i then Just i else Nothing

The guards assure that the Int we are checking for is a single digit. Otherwise, we are just checking that the first character of our String matches the digit we are checking for. If it passes, we return the digit wrapped in a Just. Otherwise we return Nothing.

Now, (<|>) can be used to run two parsers in parallel. That is, we use the result of the first one if it succeeds, and otherwise, we use the result of the second. If both fail, then the combined parser returns Nothing. We can use digit with (<|>) to, for instance, parse strings of binary digits:

binChar :: String -> Maybe Int
binChar s = digit 0 s <|> digit 1 s

Parser libraries often make use of Alternative in this way. Two examples are (+++) in Text.ParserCombinators.ReadP and (<|>) in Text.ParserCombinators.Parsec.Prim. This usage pattern can be described in terms of choice. For instance, if we want to give binChar a string that will be successfully parsed, we have two choices: either to begin the string with '0' or with '1'.

MonadPlus

[edit | edit source]

The MonadPlus class is closely related to Alternative:

class Monad m => MonadPlus m where
  mzero :: m a
  mplus :: m a -> m a -> m a

Its definition is the same as Alternative, except for different method names and the Applicative constraint being changed into Monad. Unsurprisingly, for types that have instances of both Alternative and MonadPlus, mzero and mplus should be equivalent to empty and (<|>) respectively.

One might legitimately wonder why the seemingly redundant MonadPlus class exists. Part of the reason is historical: just like Monad existed in Haskell long before Applicative was introduced, MonadPlus is much older than Alternative. Beyond such accidents, there are additional expectations (ones that do not apply to Alternative) about how the MonadPlus methods should interact with the Monad, and therefore indicating that something is a MonadPlus is a stronger claim than indicating that it is both an Alternative and a Monad. We will make some additional considerations about this issue in the following section.

Alternative and MonadPlus laws

[edit | edit source]

Like most general-purpose classes, Alternative and MonadPlus are expected to follow a handful of laws. However, there isn't universal agreement on what the full set of laws should look like. The most commonly adopted laws, and the most crucial for providing intuition about Alternative, say that empty and (<|>) form a monoid. By that, we mean:

-- empty is a neutral element
empty <|> u  =  u
u <|> empty  =  u
-- (<|>) is associative
u <|> (v <|> w)  =  (u <|> v) <|> w

There is nothing fancy about "forming a monoid": in the above, "neutral element" and "associative" here is just like how addition of integer numbers is said to be associative and to have zero as its neutral element. In fact, this analogy is the source of the names of the MonadPlus methods, mzero and mplus.

As for MonadPlus, at a minimum there usually are the monoid laws, which correspond exactly to the ones just above...

mzero `mplus` m  =  m
m `mplus` mzero  =  m
m `mplus` (n `mplus` o)  =  (m `mplus` n) `mplus` o

... plus the additional two laws, quoted by the Control.Monad documentation:

mzero >>= f  =  mzero -- left zero
m >> mzero   =  mzero -- right zero

If mzero is interpreted as a failed computation, these laws state that a failure within a chain of monadic computations leads to the failure of the whole chain.

We will touch upon some additional suggestions of laws for Alternative and MonadPlus at the end of the chapter.

Useful functions

[edit | edit source]

In addition to (<|>) and empty, there are two other general-purpose functions in the base libraries involving Alternative.

A common task when working with Alternative is taking a list of alternative values, e.g. [Maybe a] or [[a]], and folding it down with (<|>). The function asum, from Data.Foldable fulfills this role:

asum :: (Alternative f, Foldable t) => t (f a) -> f a
asum = foldr (<|>) empty

In a sense, asum generalizes the list-specific concat operation. Indeed, the two are equivalent when lists are the Alternative being used. For Maybe, asum finds the first Just x in the list and returns Nothing if there aren't any.

It should also be mentioned that msum, available from both `Data.Foldable` and `Control.Monad`, is just asum specialised to MonadPlus.

msum :: (MonadPlus m, Foldable t) => t (m a) -> m a

guard

[edit | edit source]

When discussing the list monad we noted how similar it was to list comprehensions, but we didn't discuss how to mirror list comprehension filtering. The guard function from Control.Monad allows us to do exactly that.

Consider the following comprehension which retrieves all pythagorean triples (i.e. trios of integer numbers which work as the lengths of the sides for a right triangle). First we'll examine the brute-force approach. We'll use a boolean condition for filtering; namely, Pythagoras' theorem:

pythags = [ (x, y, z) | z <- [1..], x <- [1..z], y <- [x..z], x^2 + y^2 == z^2 ]

The translation of the comprehension above to a list monad do-block is:

pythags = do
  z <- [1..]
  x <- [1..z]
  y <- [x..z]
  guard (x^2 + y^2 == z^2)
  return (x, y, z)

The guard function can be defined for all Alternatives like this:

guard :: Alternative m => Bool -> m ()
guard True  = pure ()
guard _ = empty

guard will reduce a do-block to empty if its predicate is False. Given the left zero law...

mzero >>= f = mzero
-- Or, equivalently:
empty >>= f = empty

... an empty on the left-hand side of an >>= operation will produce empty again. As do-blocks are decomposed to lots of expressions joined up by (>>=), an empty at any point will cause the entire do-block to become empty.

Let's examine in detail what guard does in the pythags. First, here is guard defined for the list monad:

-- guard :: Bool -> [()]
guard True  = [()]
guard _ = []

Basically, guard blocks off a route. In pythags, we want to block off all the routes (or combinations of x, y and z) where x^2 + y^2 == z^2 is False. Let's look at the expansion of the above do-block to see how it works:

pythags =
  [1..] >>= \z ->
  [1..z] >>= \x ->
  [x..z] >>= \y ->
  guard (x^2 + y^2 == z^2) >>= \_ ->
  return (x, y, z)

Replacing >>= and return with their definitions for the list monad (and using some let-bindings to keep it readable), we obtain:

pythags =
 let ret x y z = [(x, y, z)]
     gd  z x y = concatMap (\_ -> ret x y z) (guard $ x^2 + y^2 == z^2)
     doY z x   = concatMap (gd  z x) [x..z]
     doX z     = concatMap (doY z  ) [1..z]
     doZ       = concatMap (doX    ) [1..]
 in doZ

Remember that guard returns the empty list in the case of its argument being False. Mapping across the empty list produces the empty list, no matter what function you pass in. So an empty list produced by the call to guard in gd will cause gd to produce an empty list, with \_ -> ret x y z, which would otherwise add a result, not being actually called.

To understand why this matters, think about list-computations as a tree. With our Pythagorean triple algorithm, we need a branch starting from the top for every choice of z, then a branch from each of these branches for every value of x, then from each of these, a branch for every value of y. So the tree looks like this:

   start
   |_________________________...
   |    |         |
z  1    2         3
   |    |____     |____________
   |    |    |    |       |    |
x  1    1    2    1       2    3
   |    |_   |    |___    |_   |
   |    | |  |    | | |   | |  |
y  1    1 2  2    1 2 3   2 3  3

Each combination of z, x and y represents a route through the tree. Once all the functions have been applied, the results of each branch are concatenated together, starting from the bottom. Any route where our predicate doesn't hold evaluates to an empty list, and so has no impact on this concatenation.

Exercises

[edit | edit source]
Exercises
  1. Prove the Alternative monoid laws for Maybe and lists.
  2. We could augment the parser from the parallel parsing example so that it would handle any character, in the following manner:
    -- | Consume a given character in the input, and return 
    --   the character we just consumed, paired with rest of 
    --   the string. We use a do-block  so that if the
    --   pattern match fails at any point, 'fail' of the
    --   Maybe monad (i.e. Nothing) is returned.
    char :: Char -> String -> Maybe (Char, String)
    char c s = do
      c' : s' <- return s
      guard (c == c')
      return (c, s')
    
    It would then be possible to write a hexChar function which parses any valid hexadecimal character (0-9 or a-f). Try writing this function (hint: map digit [0..9] :: [String -> Maybe Int]).
  3. Use guard and the Applicative combinators (pure, (<*>), (*>), etc.) to implement safeLog from the Maybe monad chapter. Do not use the Monad combinators (return, (>>=), (>>), etc.).

Relationship with monoids

[edit | edit source]

While discussing the Alternative laws above, we alluded to the mathematical concept of monoids. There is in fact already a Monoid class in Haskell (defined in Data.Monoid). A thorough presentation of monoid will be given in a later chapter. However for now it suffices to say that a minimal definition of Monoid implements two methods; namely, a neutral element (or 'zero') and an associative binary operation (or 'plus').

class Monoid m where 
  mempty  :: m
  mappend :: m -> m -> m

For example, lists form a simple monoid:

instance Monoid [a] where
  mempty  = []
  mappend = (++)

Looks familiar, doesn't it? In spite of the uncanny resemblance to Alternative and MonadPlus, there is a key difference. Note the use of [a] instead of [] in the instance declaration. Monoids are not necessarily "wrappers" of anything, or parametrically polymorphic. For instance, the integer numbers form a monoid under addition with 0 as neutral element. Alternative is a separate type class because it captures a specific sort of monoid with distinctive properties − for instance, a binary operation (<|>) :: Alternative f => f a -> f a -> f a that is intrinsically linked to an Applicative context.

Other suggested laws

[edit | edit source]

Note

Consider this as a bonus section. While it is good to be aware of there being various takes on these laws, the whole issue is, generally speaking, not worth losing sleep over.


Beyond the commonly assumed laws mentioned a few sections above, there are a handful of others which make sense from certain perspectives, but do not hold for all existing instances of Alternative and MonadPlus. The current MonadPlus, in particular, might be seen as an intersection between a handful of hypothetical classes that would have additional laws.

The following two additional laws are commonly suggested for Alternative. While they do hold for both Maybe and lists, there are counterexamples in the core libraries. Also note that, for Alternatives that are also MonadPlus, the mzero laws mentioned earlier are not a consequence of these laws.

(f <|> g) <*> a = (f <*> a) <|> (g <*> a) -- right distributivity (of <*>)
empty <*> a = empty -- right absorption (for <*>)

As for MonadPlus, a common suggestion is the left distribution law, which holds for lists, but not for Maybe:

(m `mplus` n) >>= k  =  (m >>= k) `mplus` (n >>= k) -- left distribution

Conversely, the left catch law holds for Maybe but not for lists:

return x `mplus` m = return x -- left catch

It is generally assumed that either left distribution or left catch will hold for any MonadPlus instance. Why not both? Suppose they both hold. Then for any x, y :: m a,

x `mplus` y
= -- monad identity
(return x >>= id) `mplus` (return y >>= id)
= -- left distribution
(return x `mplus` return y) >>= id
= -- left catch
return x >>= id
= -- monad identity
x

This immediately rules out all but the most trivial MonadPlus implementation. Even worse, it implies that for any x, mzero `mplus` x = mzero. Adding the monoid identity law mzero `mplus` x = x then implies that the monad has only one value, and is thus isomorphic to the trivial monad Data.Proxy.Proxy.

Finally, it is worth noting that there are divergences even about the monoid laws. One case sometimes raised against them is that for certain non-determinism monads typically expressed in terms of MonadPlus the key laws are left zero and left distribution, while the monoid laws in such cases lead to difficulties and should be relaxed or dropped entirely.

Some entirely optional further reading, for the curious reader: