Jump to content

Haskell/Understanding monads/State

From Wikibooks, open books for an open world
(Redirected from Haskell/Understanding monads/The State monad)

If you have programmed in any other language before, you likely wrote some functions that "kept state". For those new to the concept, a state is one or more variables that are required to perform some computation but are not among the arguments of the relevant function. Object-oriented languages like C++ make extensive use of state variables (in the form of member variables inside classes and objects). Procedural languages like C on the other hand typically use global variables declared outside the current scope or static variables in the functions to keep track of state.

In Haskell, however, such techniques are not as straightforward to apply. Doing so will require mutable variables which would mean that functions will have hidden dependencies, which is at odds with Haskell's functional purity. Fortunately, often it is possible to keep track of state in a functionally pure way. We do so by passing the state information from one function to the next, thus making the hidden dependencies explicit.

The State type is designed to simplify this process of threading state through functions. In this chapter, we will see how it can assist us in some typical problems involving state: modelling a state machine and generating pseudo-random numbers.

State Machine

[edit | edit source]

We will model a simple finite-state machine based on a coin-operated turnstile. Our model will be enhanced so that, in any state, it will create an output (in addition to a state transition) for each input[1].

Turnstile Example

[edit | edit source]

The finite-state model of our turnstile is shown in this state-transition diagram:

The turnstile has two states: Locked and Unlocked. (It starts in a Locked state). There are two types of input: Coin (corresponding to someone putting a coin in the slot) and Push (corresponding to someone pushing the arm). Each input causes an output (Thank, Open or Tut) and a transition to a (new, or maybe the same) state. If someone puts a coin in when the turnstile is locked, they are thanked (yes, it can talk!) and the turnstile becomes unlocked. If they add more coins, they are thanked more but get no benefit (the turnstile simply remains unlocked with no memory of how many additional coins have been added). If someone pushes the arm when the turnstile is unlocked, the arm will open to let them through, then become locked to prevent anyone else going through. If someone pushes the arm when the turnstile is locked, it will politely tut at them but not let them through and remain locked.

Basic Model in Haskell

[edit | edit source]

We will represent the states and outputs as follows:

data TurnstileState = Locked | Unlocked
  deriving (Eq, Show)

data TurnstileOutput = Thank | Open | Tut
  deriving (Eq, Show)

But what about the inputs? We can model them as functions. Here's a first attempt:

coin, push :: TurnstileState -> TurnstileOutput

coin _ = Thank

push Unlocked = Open
push Locked   = Tut

These correctly return the output for each input in any state, but don't give any indication of the new state. (In an imperative program, these "functions" might also update a variable to indicate the new state, but that is not an option in Haskell, nor, we claim, desirable). The answer is easy and obvious: return the new state along with the output:

coin, push :: TurnstileState -> (TurnstileOutput, TurnstileState)

coin _ = (Thank, Unlocked)

push Locked   = (Tut , Locked)
push Unlocked = (Open, Locked)

Sequencing Steps

[edit | edit source]

How can we use this? One way is to list the set of outputs resulting from a sequence of inputs:

monday :: TurnstileState -> ([TurnstileOutput], TurnstileState)
monday s0 =
  let (a1, s1) = coin s0
      (a2, s2) = push s1
      (a3, s3) = push s2
      (a4, s4) = coin s3
      (a5, s5) = push s4
  in ([a1, a2, a3, a4, a5], s5)

GHCi> monday Locked
([Thank,Open,Tut,Thank,Open],Locked)

Note that, like coin and push, this monday function takes an initial state as a parameter and returns the final state alongside the list of outputs.

Exercises
  1. Implement functions regularPerson, distractedPerson, hastyPerson :: TurnstileState -> ([TurnstileOutput], TurnstileState). Given a TurnstileState, each should return the outputs (and updated state) of a different type of person. A regularPerson always inserts a coin then pushes the arm. A distractedPerson always inserts a coin then wanders off without going through. A hastyPerson first pushes the arm without having inserted a coin. If it opens (for example, they're following a distractedPerson) they go through. If not (for example they're following a regularPerson) they'll then insert a coin and push the arm to go through.
  2. Use these functions to implement tuesday :: TurnstileState -> ([TurnstileOutput], TurnstileState), returning the outputs from this sequence of visitors: regularPerson, hastyPerson, distractedPerson, hastyPerson.
  3. Implement luckyPair :: Bool -> TurnstileState -> (Bool, TurnstileState) representing two people attempting to use the turnstile in succession. The first is either a regularPerson or a distractedPerson (depending on the Bool argument). The second person will simply push the arm without inserting a coin and give up if they don't get through. The Bool result should indicate whether the second person made it through.

From the examples it can be seen that:

  • the state (of some fixed type) is always passed from step to step, and (usually) included in the input to and output from a function: having the state as input to and output from functions allows us to chain them together as steps in bigger functions, in the same way the steps within these smaller functions are chained together;
  • the (non-state) output of a step may or may not be used in deciding subsequent steps: hastyPerson uses the output of the first push to determine whether they need to insert a coin, but regularPerson always does the same two steps regardless of their outputs;
  • the (non-state) output of a step may or may not be used in the final (non-state) return value from a function: the return value from regularPerson uses the output from each step, but the return value from luckyPair doesn't depend on the output from the first step;
  • a function may take parameters in addition to the initial state: luckyPair takes a Bool parameter;
  • the types of the (non-state) return values can be different for different functions: coin returns TurnstileOutput, monday returns [TurnstileOutput] and luckyPair returns Bool.

But all of this code is cumbersome, tedious to write and error prone. It would be ideal if we could automate the extraction of the second member of the tuple (i.e. the new state) and feed it to the next step, whilst also allowing the function to use the (non-state) values to make decisions about further steps and/or include in the (non-state) result(s). This is where State comes into the picture.

Introducing State

[edit | edit source]

The Haskell type State describes functions that consume a state and produce both a result and an updated state, which are given back in a tuple.

The state function is wrapped by a data type definition which comes along with a runState accessor so that pattern matching becomes unnecessary. For our current purposes, the State type might be defined as:

newtype State s a = State { runState :: s -> (a, s) }

Here, s is the type of the state, and a the type of the produced result. Calling the type State is arguably a bit of a misnomer because the wrapped value is not the state itself but a state processor.

newtype

[edit | edit source]

Note that we defined the data type with the newtype keyword, rather than the usual data. newtype can be used only for types with just one constructor and just one field. It ensures that the trivial wrapping and unwrapping of the single field is eliminated by the compiler. For that reason, simple wrapper types such as State are usually defined with newtype. Would defining a synonym with type be enough in such cases? Not really, because type does not allow us to define instances for the new data type, which is what we are about to do...

Where did the State constructor go?

[edit | edit source]

In the wild, the State type is provided by Control.Monad.Trans.State in transformers (and is also reexported by Control.Monad.State in mtl). When you use it, you will quickly notice there is no State constructor available. The transformers package implements the State type in a different way. The differences do not affect how we use or understand State, except that instead of a State constructor, Control.Monad.Trans.State exports a state function,

state :: (s -> (a, s)) -> State s a

which does the same job. As for why the actual implementation is not the obvious one we presented above, we will get back to that a few chapters down the road.

Note

In all of the code below, State is a (parameterised) type (that we will make instances of Monad, etc), and state is a function that takes an argument (of type s -> (a, s)) and returns a value of type State s a. If you are creating your own version of State by following this text as a tutorial (which I recommend), please add this code after the newtype declaration:

state :: (s -> (a, s)) -> State s a
state = State

This will ensure the code below is valid whether you use your own State implementation or the one provided.


Instantiating the Monad

[edit | edit source]

So far, all we have done was to wrap a function type and give it a name. There is another ingredient, however: for every type s, State s can be made a Monad instance, giving us very handy ways of using it.

To define a Monad instance, there must also be instances for Functor and Applicative. As we explained previously, these superclass instances can be derived as follows from the Monad instance that we are about to define in more detail.

import Control.Monad  -- you will need to put this towards the top of the file

instance Functor (State s) where
  fmap = liftM

instance Applicative (State s) where
  pure = return
  (<*>) = ap

In a later section we will discuss the implications of State also being a Functor and an Applicative in more detail. You will also get a chance to reimplement the above explicitly based on their behaviour, without simply deferring to the Monad instance.

So let's define this instance.

instance Monad (State s) where

Note the instance is State s, and not just State on its own; State can't be made an instance of Monad, as it takes two type parameters, rather than one. That means there are actually many different State monads, one for each possible type of state - State String, State Int, State SomeLargeDataStructure, and so forth. However, we only need to write one implementation of return and (>>=); the methods will be able to deal with all choices of s.

The return function is implemented as:

return :: a -> State s a
return x = state ( \ s -> (x, s) )

Giving a value (x) to return produces a function which takes a state (s) and returns it unchanged, together with the value we want to be returned. As a finishing step, the function is wrapped up with the state function.

Exercises
  1. Try to write the bind method before looking at the solution below.

As for binding, it can be defined like this:

(>>=) :: State s a -> (a -> State s b) -> State s b
p >>= k = q where
    p' = runState p        -- p' :: s -> (a, s)
    k' = runState . k      -- k' :: a -> s -> (b, s)
    q' s0 = (y, s2) where  -- q' :: s -> (b, s)
        (x, s1) = p' s0    -- (x, s1) :: (a, s)
        (y, s2) = k' x s1  -- (y, s2) :: (b, s)
    q = state q'

We wrote the definition above in a quite verbose way, to make the steps involved easier to pinpoint. A more compact way of writing it would be:

p >>= k = state $ \ s0 ->
   let (x, s1) = runState p s0  -- Running the first processor on s0.
   in runState (k x) s1         -- Running the second processor on s1.

(>>=) is given a state processor (p) and a function (k) that is used to create another processor from the result of the first one. The two processors are combined into a function that takes the initial state (s) and returns the second result and the third state (i.e. the output of the second processor). Overall, (>>=) here allows us to run two state processors in sequence, while allowing the result of the first stage to influence what happens in the second one.

Schematic representation of how bind creates a new state processor (pAB) from a state processor (pA) and a processor-making function (f). s1, s2 and s3 are states. v1 and v2 are values. pA, pB and pAB are state processors. The wrapping and unwrapping by State/runState is implicit.

One detail in the implementation is how runState is used to undo the State wrapping, so that we can reach the function that will be applied to the states. The type of runState p, for instance, is s -> (a, s).

Understanding the Bind Operator

[edit | edit source]

Another way to understand this derivation of the bind operator >>= is to consider once more the explicit but cumbersome way to simulate a stateful function of type a -> b by using functions of type (a, s) -> (b, s) , or, said another way: a -> s -> (b,s) = a -> (s -> (b,s)). These classes of functions pass the state on from function to function. Note that this last signature already suggests the right-hand side type in a bind operation where the abstract type is S b = (s -> (b, s)).

Now that we have seen how the types seem to suggest the monadic signatures, let's consider a much more concrete question: Given two functions f :: s -> (a, s) and g :: a -> s -> (b, s), how do we chain them to produce a new function that passes on the intermediate state?

This question does not require thinking about monads: one option is to simply use function composition. It helps our exposition if we just write it down explicitly as a lambda expression:

compose :: (s -> (a,s)) ->         {- first function -}
           (a -> (s -> (b,s))) ->  {- second function,  note type is similar to  (a,s) -> (b,s) -}
           s -> (b,s)              {- composed function -}
compose f g = \s0 -> let (a1, s1) = f s0 in (g a1) s1 
{-This lambda expression threads both intermediate results produced by f into those required by g -}

Now, if in addition to chaining the input functions, we find that the functions of signature s -> (a,s) were all wrapped in an abstract datatype Wrapped a, and that therefore we need to call some other provided functionswrap :: (s -> (a,s)) -> Wrapped a, and unwrap :: Wrapped a -> (s -> (a,s)) in order to get to the inner function, then the code changes slightly:

{- what happens if the type  s -> (a,s) is wrapped and this new type is  called Wrapped a -}
composeWrapped :: Wrapped a -> (a -> Wrapped b) -> Wrapped b
composeWrapped wrappedf g = wrap (\s0 -> let (a1,s1) = (unwrap wrappedf) s0 in (unwrap (g a1)) s1)

{- or, reusing compose -}
composeWrapped wrappedf g = wrap (compose (unwrap wrappedf) (fmap unwrap g))

This code is the implementation of (>>=) shown above, with wrap = state and unwrap = runState, so we can now see how the definition of bind given earlier is the standard function composition for this special kind of stateful function.

This explanation does not address yet where the original functions Wrapped a and a -> Wrapped b come from in the first place, but they do explain what you can do with them once you have them.

Turnstile using State

[edit | edit source]

We now look at how the State type can help with the turnstile example. Firstly, by comparing the type of coin :: TurnstileState -> (TurnstileOutput, TurnstileState) with newtype State s a = State { runState :: s -> (a, s) }, we can see that, by replacing s with TurnstileState and a by TurnstileOutput we can define:

coinS, pushS :: State TurnstileState TurnstileOutput
coinS = State coin
pushS = State push

Note

I've added S at the end of these names, just to distinguish them from those on this page that aren't based on the State monad. It's not something you'd normally do.


We can then use runState to extract the underlying functions and apply them to a state, for example:

GHCi> :t runState coinS
runState coinS :: TurnstileState -> (TurnstileOutput, TurnstileState)
GHCi> runState coinS Locked
(Thank,Unlocked)

Note

There is an interesting comparison here between runState and partial application of functions.

We normally consider, for example, take :: Int -> [a] -> [a] to be a function of two arguments. But we also saw (briefly, here) that we can apply it to only one argument to get a function of the one remaining argument, e.g. take 2 :: [a] -> [a]. We could then do e.g. map (take 2) ["every", "good", "boy"] to get ["ev", "go", "bo"]. In fact, Haskell always applies arguments one step at a time, so that take 2 "every" first applies the 2 to get a new function, to which it then applies "every". It could be written (take 2) "every".

runState is defined as a function that takes a single argument. And it returns a function that takes a single argument, to which we can then apply an initial state value. So runState coinS Locked means (runState coinS) Locked. But, as with (take 2) "every", the brackets are not needed.

In terms of application of arguments, take and runState are similar: they take one argument and return a function that takes another. The big difference between them is in their definitions. When take is defined it declares two parameters and uses both of them in the function definition. runState, however, is (implicitly) defined as a function of a single parameter. The function it returns is defined separately (by a user of State), and the type of State requires it to be a function of one parameter. Each of these implementations reference only their own parameters directly.


Using the Turnstile State monad

[edit | edit source]

Not yet too exciting, but now coinS and pushS are monadic (they are functions — admittedly of zero parameters — that return Monad instances) we can do monadic stuff with them, including using do notation:

mondayS :: State TurnstileState [TurnstileOutput]
mondayS = do
  a1 <- coinS
  a2 <- pushS
  a3 <- pushS
  a4 <- coinS
  a5 <- pushS
  return [a1, a2, a3, a4, a5]

GHCi> :t runState mondayS
runState mondayS :: TurnstileState -> ([TurnstileOutput], TurnstileState)
GHCi> runState mondayS Locked
([Thank,Open,Tut,Thank,Open],Locked)

Note that we're no longer writing all the code to thread the output state from each step into the next: the State monad is doing that for us. A lot of the tedious and error-prone work has been removed. How? Remember that do is simply syntactic sugar for the bind (>>=) operator so the above is equivalent to:

mondayS =
  coinS >>= (\ a1 ->
    pushS >>= (\ a2 ->
      pushS >>= (\ a3 ->
        coinS >>= (\ a4 ->
          pushS >>= (\ a5 ->
            return [a1, a2, a3, a4, a5] )))))

This uses the (>>=) operator we defined for State above, unwraps each state-processing function from its State wrapper, applies the output state from it as an argument into the next step, and wraps the result back in a State wrapper. The sequence of (>>=) operators, along with return combines all the steps into a single combined state-processing function wrapped in a State wrapper, which we can access and run with runState.

A monad is sometimes described as providing a value in a context. An IO monad can provide values from the real world when we ask it to. A Maybe monad can provide values if it's there, or not otherwise. What about the State monad? It can provide a value when we execute a step of a state-processor. (And the monad "automatically" ensures that state changes are passed from step to step without us having to worrying about it).

In this example, some tedium remains in obtaining the list of outputs from each step and combining them into a list. Can we do better? Yes we can:

mondayS :: State TurnstileState [TurnstileOutput]
mondayS = sequence [coinS, pushS, pushS, coinS, pushS]

We met sequence in the section on IO Monads. It creates a single action (in this case a state processing function) which, when executed, runs through each of the actions (in this case state processing steps) in turn, executing them and combining the results into a list.

Exercises
  1. Implement the functions regularPersonS, distractedPersonS, hastyPersonS :: State TurnstileState [TurnstileOutput] using sequence. You will still need do notation for the third.[2]
  2. Implement luckyPairS :: Bool -> State TurnstileState Bool

evalState and execState

[edit | edit source]

We have seen how runState accesses the state processing function so that we can do, for example, runState mondayS Locked. (We also used it in the definition of (>>=).)

Other functions which are used in similar ways are evalState and execState. Given a State a b and an initial state, the function evalState will give back only the result value of the state processing, whereas execState will give back just the new state.

evalState :: State s a -> s -> a
evalState p s = fst (runState p s)

execState :: State s a -> s -> s
execState p s = snd (runState p s)

OK, they're not much. But they're not nothing, and they allow us to do e.g.:

GHCi> evalState mondayS Locked
[Thank,Open,Tut,Thank,Open]

if we only want to see the output sequence, and not the final state.

Setting the State

[edit | edit source]

What if we had an turnstile engineer who wanted to test the locking mechanism with code like this:

testTurnstile :: State TurnstileState Bool
testTurnstile = do
  --somehow set state to Locked
  check1 <- pushS
  --somehow set state to Unlocked
  check2 <- pushS
  --somehow set state to Locked again
  return (check1 == Tut && check2 == Open)

This handy function comes to the rescue:

put :: s -> State s ()
put newState = state $ \_ -> ((), newState)

put is a monadic function that can be bound with (>>=) operators or fit in do constructs in sequence with other actions. It takes a state parameter (the one we want to introduce) and generates a state processor which ignores whatever state it receives and gives back the new state we introduced as the next state. Since we don't care about the result of this processor (all we want to do is to replace the state), the first element of the tuple will be (), the universal placeholder value.[3]

Let's see how it helps the engineer:

testTurnstile :: State TurnstileState Bool
testTurnstile = do
  put Locked
  check1 <- pushS
  put Unlocked
  check2 <- pushS
  put Locked
  return (check1 == Tut && check2 == Open)

GHCi> runState testTurnstile Locked
(True,Locked)
HHCi> runState testTurnstile Unlocked
(True,Locked)

Accessing the State

[edit | edit source]

In the definition of pushS above we made use of the existing code push. What if we wanted to write it without such pre-existing function? Obviously we could do this:

pushS = state $ \s -> case s of
  Locked   -> (Tut , Locked)
  Unlocked -> (Open, Locked)

but could we write it all using a do construct? Yes, using this:

get :: State s s
get = state $ \s -> (s, s)

get is also monadic and creates a state processor that gives back the state s it is given both as a result and as the next state. That means the state will remain unchanged, and that a copy of it will be made available for us to use.

We could use get like this:

pushS = do
  s <- get
  put Locked
  case s of
    Locked   -> return Tut
    Unlocked -> return Open
Exercises
  1. Rewrite coinS using a do construct with get and/or put.
  2. Extend testTurnstile so that it also checks the state is set to Unlocked after a coin is inserted, regardless of the state beforehand. And for good measure, have testTurnstile return the turnstile to it's original state when the testing is complete.
  3. Besides put and get, there are also
    modify :: (s -> s) -> State s ()
    which modifies the current state using a function, and
    gets :: (s -> a) -> State s a
    which produces a modified copy of the state while leaving the state itself unchanged. Write implementations for them.

Monadic Control Structures

[edit | edit source]

The second version of mondayS above shows another benefit of using the monad, in addition to the hiding of the state threading and ability to use do notation and the like: we are also able to use great functions like sequence. In this section we look at some more of these functions. (You will need to import Control.Monad, or do GHCi> :m Control.Monad to ensure all of these are in scope).

First, here's replicateM:

GHCi> evalState (replicateM 6 pushS) Unlocked
[Open,Tut,Tut,Tut,Tut,Tut]

Which is pretty self-explanatory.

Before we look at any more, we need a slightly different (arguably better) implementation of the turnstile finite-state machine, using an input type and a single processing function:

data TurnstileInput = Coin | Push
  deriving (Eq, Show)
  
turnS :: TurnstileInput -> State TurnstileState TurnstileOutput
turnS = state . turn where
  turn Coin _        = (Thank, Unlocked)
  turn Push Unlocked = (Open,  Locked)
  turn Push Locked   = (Tut,   Locked)

GHCi> runState (turnS Coin) Locked
(Thank,Unlocked)

We can now use mapM, like this:

GHCi> evalState (mapM turnS [Coin, Push, Push, Coin, Push]) Locked
[Thank,Open,Tut,Thank,Open]

This very nicely illustrates how the finite-state machine is a transducer: it converts an ordered sequence of inputs to an ordered sequence of outputs, maintaining the state as it goes along.

Now we'll look at filterM:

getsThroughS :: TurnstileInput -> State TurnstileState Bool
getsThroughS input = do
  output <- turnS input
  return $ output == Open

GHCi> evalState (filterM getsThroughS [Push, Coin, Coin, Push, Push, Coin, Push]) Locked
[Push,Push]

We can see two people made it through (not surprisingly, when they pushed the arm). If we switch the order of the first two inputs more people get through:

GHCi> evalState (filterM getsThroughS [Coin, Push, Coin, Push, Push, Coin, Push]) Locked
[Push,Push,Push]

Here's a different way of counting the number of openings using foldM:

countOpens :: [TurnstileInput] -> State TurnstileState Int
countOpens = foldM incIfOpens 0 where
  incIfOpens :: Int -> TurnstileInput -> State TurnstileState Int
  incIfOpens n i = do
    g <- getsThroughS i
    if g then return (n+1) else return n

GHCi> evalState (countOpens [Coin, Push, Coin, Push, Push, Coin, Push]) Locked
3

Note that sequence, mapM and filterM always execute all of the actions in the input list, but foldM could skip some.

Exercises
  1. Modify regularPersonS, distractedPersonS and hastyPersonS to use turnS and mapM.
  2. Implement tuesdayS using sequence or mapM.
  3. Implement saveCoinsS :: [TurnstileInput] -> State TurnstileState Int that potentially processes all of the given inputs, but will skip a Coin input if the previous input generated a Thanks. It returns the number of Coin inputs that were skipped. E.g. evalState (saveCoinsS [Push, Coin, Coin, Coin, Push, Push, Coin, Push]) Locked should give 2.
  4. Implement sequenceUntil :: (a -> Bool) -> [State s a] -> State s [a]. It processes each of the inputs until one of them generates a value that matches the predicate, then processes no more. E.g. evalState (sequenceUntil (== Open) [coinS, coinS, coinS, pushS, pushS, coinS]) Locked should give [Thank,Thank,Thank,Open]
  5. Modify sequenceUntil so that it works with any Monad instance.

Pseudo-Random Numbers

[edit | edit source]

Suppose we are coding a game in which at some point we need an element of chance. In real-life games that is often obtained by means of dice or similar. For a computer program we need something to emulate such an object, and most programming languages provide some concept of random numbers that can be used for this purpose[4].

Generating actual random numbers is hard. Computer programs almost always use pseudo-random numbers instead. They are "pseudo" because they are not actually random, and that they are known in advance. Indeed, they are generated by algorithms (the pseudo-random number generators) which take an initial state (commonly called the seed) and produce from it a sequence of numbers that have the appearance of being random.[5] Every time a pseudo-random number is requested, state somewhere must be updated, so that the generator can be ready for producing a fresh, different random number the next time. Sequences of pseudo-random numbers can be replicated exactly if the initial seed and the generating algorithm are known.

Haskell Global Pseudo-Random Number Generator

[edit | edit source]

Producing a pseudo-random number in most programming languages is very simple: there is a function somewhere in the libraries that provides a pseudo-random value (and also updates an internal mutable state so that it produces a different value next time, although some implementations perhaps produce a truly random value). Haskell has a similar one in the System.Random module from the random package:

GHCi> :m System.Random
GHCi> :t randomIO
randomIO :: Random a => IO a

What is Random? It's the class of types that can have pseudo-random values generated by the System.Random module functions. Int, Integer, Bool and others are all instances of Random. You can "request" any of these by specifying the result type:

GHCi> randomIO :: IO Int
-1557093684
GHCi> randomIO :: IO Int
1342278538
GHCi> randomIO :: IO Bool
True

More interestingly, randomIO is an IO action. It couldn't be otherwise, as it makes use of mutable state, which is kept out of reach from our Haskell programs. Thanks to this hidden dependency, the pseudo-random values it gives back can be different every time.

However, we're here to study the State monad, so let's look at functions that take and return an explicit representation of the random number generator state.

Haskell Pseudo-Random Number Generator with Explicit State

[edit | edit source]

Here's a slightly different function in the System.Random module:

GHCi> :t random
random :: (Random a, RandomGen g) => g -> (a, g)

Now there's no IO, and we should recognise the g -> (a, g) pattern as something we could put inside a State wrapper.

What is RandomGen? It is another class defined in the System.Random module. The module also provides a single instance StdGen. There are a couple of ways to create values of this type. The one we will use first is mkStdGen :: Int -> StdGen which creates a StdGen value from a given seed:

GHCi> mkStdGen 666
667 1
GHCi> mkStdGen 666
667 1

Note that, given the same seed, you get the same StdGen. What is StdGen? The documentation calls it "the standard pseudo-random number generator", but it might be better to call it the state of the standard pseudo-random number generator. We can see that here:

GHCi> let s = mkStdGen 666
GHCi> s
667 1
GHCi> random s :: (Int, StdGen)
(6438947685955577547,392509921 2103410263)

The first function (mkStdGen 666) returns an initial state, based on a given seed of 666. The second function (random s) takes the initial StdGen state and returns a pair: a random value (we've requested an Int) and a new StdGen state. How is the state represented internally? The System.Random module does it somehow, and we don't really care how. (We can see StdGen implements show, which displays two funny numbers. We could go and look at the source code if we really wanted to see how it works, but some clever person might go and change it one day anyway). How does random calculate a new state? We also don't care; we can just be happy that it does.

Example: Rolling Dice

[edit | edit source]
randomR (1,6)

We are going to build a dice-throwing example. And for this, we'll use a slightly different function:

GHCi> let s = mkStdGen 666
GHCi> randomR (1,6) s
(6,26689338 40692)

randomR takes a range (in this case 1 to 6) and returns a pseudo-random number in the range (we were lucky: we got a 6!).

Suppose we want a function that rolls two dice and returns a pair representing the result of each throw. Here's one way:

import System.Random --put this towards the top of the file

rollPair :: StdGen -> ((Int, Int), StdGen)
rollPair s0 =
  let (r1, s1) = randomR (1,6) s0
      (r2, s2) = randomR (1,6) s1
  in ((r1, r2), s2)

GHCi> rollPair (mkStdGen 666)
((6,1),647839921 1655838864)

Doesn't this remind us of the tedious and error-prone approach we first tried in the turnstile example? Not convinced it's tedious? Try the first exercise:

Exercises
  1. Implement rollSix :: StdGen -> ([Int], StdGen) using randomR (1,6) that returns a list representing the result of six consecutive throws.
  2. Implement rollN :: Int -> StdGen -> ([Int], StdGen). This is a bit tricky! But possible using iterate and take, for example.
  3. We're about to define rollDieS :: State StdGen Int. Why don't you have a go at it first, and contemplate what it is and how it could help.

Dice with State

[edit | edit source]

So, a better way, using State:

rollDieS :: State StdGen Int
rollDieS = state $ randomR (1,6)

GHCi> runState rollDieS (mkStdGen 666)
(6,26689338 40692)

This is very similar to the original versions of coinS and pushS: there was already a function of form s -> (a, s), and we just wrapped in in a State wrapper. Now we have monadic power! We can write:

rollPairS :: State StdGen (Int, Int)
rollPairS = do
  r1 <- rollDieS
  r2 <- rollDieS
  return (r1, r2)

GHCi> runState rollPairS (mkStdGen 666)
((6,1),647839921 1655838864)

And we avoid all the tedious threading of state from one step to the next.

Exercises
  1. Implement rollSixS :: State StdGen [Int] with the same behaviour as rollSix. Use do notation and rollDieS.
  2. Implement rollNS :: Int -> State StdGen [Int] using replicateM
  3. Implement luckyDoubleS :: State StdGen Int. It does a first throw. If it's a 6 it throws again and returns the total of the two throws, else it just returns the first throw.

State is also a Functor and an Applicative

[edit | edit source]

Here's another dice throwing function:

rollDieDoubledS :: State StdGen Int
rollDieDoubledS = do
  r <- rollDieS
  return (r * 2)

Its behaviour should be clear. But it seems a bit verbose for such a simple function. Can we do better?

As we noted previously (and saw above), State (and all other monads) are also instances of Functor and Applicative. And in the prologue we did:

    ...
    let mx = readMaybe s :: Maybe Double
    case fmap (2*) mx of
    ...

This leveraged the fact that Maybe is a Functor. The fmap (2*) mx converts a Just x to Just (2*x) (or Nothing to Nothing). If we think of x as a value wrapped in a context, we can see that the fmap has kept the same context (it's still a Just, or still Nothing), but applied a conversion to the wrapped value. We can do the same with the State functor:

rollDieDoubledS = fmap (*2) rollDieS

The meaning of State is different to the meaning of Maybe (it's the output of a state-processing step, not a possibly-existing value), but we've applied the same conversion to the wrapped value. Now, when we unwrap the value from rollDieDoubledS we get double what we would have got had we unwrapped rollDieS.

Suppose we also wanted rollTwoSummedS :: State StdGen Int? In the prologue section did sz <- (++) <$> getLine <*> getLine, and again we can do something similar:

rollTwoSummedS :: State StdGen Int
rollTwoSummedS = (+) <$> rollDieS <*> rollDieS

This code depends on State being also an Applicative, but not on it being a Monad. It will ensure each of the rollDieS actions is executed in order, and chain the state correctly between them. It will then repackage the combination as a state-processing function wrapped in a State wrapper. The combined function will return the sum of the two successive throws (and also, but quietly, ensure the state is added as an input parameter and an output value).

The Control.Applicative module provides a function liftA2:

liftA2 f u v = f <$> u <*> v

Using this, it rollTwoSummedS could be defined as:

import Control.Applicative --this needs to be at the top of your file

rollTwoSummedS = liftA2 (+) rollDieS rollDieS
Exercises
  1. Rewrite rollPairS using (<$>) and (<*>), or liftA2.
  2. Implement happyDoubleS :: State StdGen Int, which throws two dice and returns the sum of the first and the second, but doubles the total if the first is a six. Code it using do notation.
  3. Rewrite happyDoubleS using (<$>) and (<*>), or liftA2.
  4. Can you recode luckyDoubleS using just (<$>) and (<*>) (or liftA2)? Why not? Can you use (<$>) (or fmap) to make it a bit shorter?
  5. Modify tuesdayS, saveCoinS and sequenceUntil from the Monadic Control Structures exercises using fmap and/or (<$>).
  6. In a previous section we defined State instances of Functor and Applicative by deferring to the Monad instance. Now try to write them explicitly based on their behaviour, without using the Monad instance.

More is said later on the relationship between Functor, Applicative and Monad, and choosing which one to use.

Pseudo-random values of different types

[edit | edit source]

We saw that randomIO :: Random a => IO a can return a value of a type other than Int. So can its IO-free equivalent random :: (Random a, RandomGen g) => g -> (a, g).

Because State StdGen is "agnostic" in regard to the type of the pseudo-random value it produces, we can write a similarly "agnostic" function that provides a pseudo-random value of unspecified type (as long as it is an instance of Random):

getRandomS :: Random a => State StdGen a
getRandomS = state random

Compared to rollDieS, this function does not specify the Int type in its signature and uses random instead of randomR; otherwise, it is just the same. getRandomS can be used for any instance of Random:

GHCi> evalState getRandomS (mkStdGen 0) :: Bool
True
GHCi> evalState getRandomS (mkStdGen 0) :: Char
'\64685'
GHCi> evalState getRandomS (mkStdGen 0) :: Double
0.9872770354820595
GHCi> evalState getRandomS (mkStdGen 0) :: Integer
2092838931

Indeed, it becomes quite easy to conjure all these at once:

someTypes :: State StdGen (Int, Float, Char)
someTypes = liftA3 (,,) getRandomS getRandomS getRandomS

allTypes :: State StdGen (Int, Float, Char, Integer, Double, Bool, Int)
allTypes = (,,,,,,) <$> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS

For writing allTypes, there is no liftA7,[6] and so we resort to plain old (<*>) instead. Using it, we can apply the tuple constructor to each of the seven random values in the State StdGen monadic context.

allTypes provides pseudo-random values for all default instances of Random; an additional Int is inserted at the end to prove that the generator is not the same, as the two Ints will be different.

GHCi> evalState allTypes (mkStdGen 0)
GHCi>(2092838931,9.953678e-4,'\825586',-868192881,0.4188001483955421,False,316817438)

(Probably) don't put or get

[edit | edit source]

In the turnstile example, we used put to set the state and get to access it. Can we do the same here? Well, yes we can, but there's probably no need.

In that example, we had to code functions (like pushS) that used the current state to determine outputs and new states, or (like testTurnstile) that set required states as part of a processing sequence. With our random number examples, all generation, inspection and update of the StdGen state is done internally within the System.Random module, without us having to know how.

Now, in our first implementation of rollPair we were aware of the StdGen state: we took it as a parameter, threaded it through the successive steps and returned the final state. If we really wanted to use the value (perhaps we wanted to put it in a debugging message using trace) then we did have the opportunity. And, with our State monad we still do. The following shows one usage:

rollDieS :: State StdGen Int
rollDieS = do s0 <- get
              let (value, s1) = randomR (1,6) s0
              put s1
              return value

This does spell out all of the steps the State monad takes for us, but would be a rather perverse implementation since the whole point of the monad is so we don't have to spell them out.

Exercises
  1. Write randomElt :: [a] -> State StdGen a, using put and get to access the StdGen state. It can assume the list is non empty, and should return a random element from within it.
  2. Rewrite randomElt without using put or get.

Better Random Numbers

[edit | edit source]

Other than our initial use of randomIO, all of the above examples have used mkStdGen, and all with the same seed value 666. This would make for a pretty boring game, where exactly the same dice were rolled each time. (Though this might be useful, e.g. when testing your program.) How can we get better random numbers? Like this:

getRandomPair :: IO (Int, Int)
getRandomPair = do
  s <- newStdGen
  return $ evalState rollPairS s

newStdGen is (effectively) defined as newStdGen :: IO StdGen. It is an IO action that spawns a new random state from the same global random state used by randomIO. It also updates that global state, so that further uses of newStdGen give a different value.

So, aren't we back a square one, being dependent on IO? No we're not. We have gained all the power of the State monad to build up chains of dice-rolling steps which we can assemble into bigger and bigger state-transformation functions. We can do all of that without IO. In the turnstile example, we didn't need IO at all (although we probably would if we wanted to put our code into some kind of application), and for some uses of random numbers, having the same numbers each time might be beneficial. We only needed IO to get "really random" numbers, and we may well need newStdGen only once in a program. Chances are that it would be alongside other IO actions, for example:

main :: IO ()
main = do
  s <- newStdGen
  let (r1, r2) = evalState rollPairS s
  putStrLn $ "You rolled twice and got " ++ show r1 ++ " and " ++ show r2 ++ "."

Handling Combined States

[edit | edit source]

Suppose we wanted to create a random turnstile, where each visitor would be given a random turnstile input: either they insert a coin (but are not allowed through); or they get to push the arm (and go through if it opens, but are otherwise sent away).

Here's one useful bit of code:

randomInputS :: State StdGen TurnstileInput
randomInputS = do
  b <- getRandomS
  return $ if b then Coin else Push

This allows us to generate random turnstileInput values[7]. However, our random turnstile machine needs to track both the state of a random number generator and the state of the turnstile. We want to write a function like this:

randomTurnS :: State (StdGen, TurnstileState) TurnstileOutput

And this function needs to call both randomInputS (which is in the State StdGen monad) and turnS (which is in the State TurnstileState monad).

Exercises
  1. Implement randomTurnS, using get and put to access and set the combined state, and runState to invoke randomInputS and turnS.

Much of the code in randomTurnS deals with managing the state: accessing the combined state, unpacking subcomponents, forwarding them to the individual State monads, recombining them and putting the combined state back. The state management code is not too bad in this case, but could easily become cumbersome in a more complex function. And it is something we wanted the State monad to hide from us.

State-Processing a Subcomponent

[edit | edit source]

Ideally we'd want some utility function(s) that allow us to invoke a State StdGen monad function (or State TurnstileState monad function) from within a State (StdGen, TurnstileState) monad function. These function(s) should take care of the state management for us, ensuring that the right subcomponent of the combined state is updated.

Here's one such a function that works for any combined state represented as a pair, and performs the state update on the first state of the pair:

processingFst :: State a o -> State (a,b) o
processingFst m = do
  (s1,s2) <- get
  let (o,s1') = runState m s1
  put (s1',s2)
  return o

Note the type:

GHCi> :t processingFst randomInputS
processingFst randomInputS :: State (StdGen, b) TurnstileInput

processingFst "converts" a State monad (in this case with state type StdGen) to another State monad (in this case with state type (StdGen, b), where b can be any type, even a TurstileState).

Exercises
  1. Implement processingSnd.
  2. Modify randomTurnS to use processingFst and processingSnd.

Note how randomTurnS is no longer directly involved in the details of the state management, and its business logic is much more apparent.

Generic Subcomponent Processing

[edit | edit source]

We can see that processingFst and processingSnd are very similar. They both extract a subcomponent of a combined state, runState on that subcomponent, then update the combined state with the new value of the subcomponent.

Let's combine them into a single generic subcomponent processing function. To do this, we could pass two functions in separate parameters, one of type (cmb -> sub) (a function that extracts a subcomponent from a combined state value), and another of type (cmb -> sub -> cmb) (a function that, given a combined value and a new value for a subcomponent, returns the revised combined value with the updated subcomponent). However, it's a bit neater to package these two functions together in a type which we'll call Lens:

data Lens cmb sub = Lens
  { view :: cmb -> sub,
    set  :: cmb -> sub -> cmb
  }

We can provide specific lenses onto the fst and snd elements in a pair:

fstL :: Lens (a,b) a
fstL = Lens fst (\(_,y) x -> (x,y))

sndL :: Lens (a,b) b
sndL = Lens snd (\(x,_) y -> (x,y))

So now:

GHCi> view fstL ("fred", 5)
"fred"
GHCi> set fstL ("fred", 5) "sue"
("sue",5)

Note

Lenses that are more sophisticated and powerful are described later, but it's also harder to understand how they work. Our simple lenses are sufficient for now, but you might want to update the random turnstile code to use "proper lenses" later on.


We can now replace processingFst and processingSnd with our generic function.

Exercises
  1. Implement processing that takes a Lens cmb sub parameter, and "converts" a State sub to a State cmb.
  2. Rewrite randomTurnS using the processing function (and fstL and sndL).

Our final random turnstile code is neater, with three separate logical functions segregated:

  • state management (now in a single processing utility function, which can be reused elsewhere);
  • subcomponent accessing and update (using Lens, which can also be reused elsewhere[8].); and
  • the "business logic" of the turnstile, which is now very apparent.

In our first implementation, all three of these were muddled together.

Let's give it a go:

GHCi> g <- newStdGen
GHCi> evalState (replicateM 10 randomTurnS) (g, Locked)
[Thank,Open,Tut,Thank,Thank,Open,Tut,Tut,Tut,Thank]

I'm not sure we'll sell many of them, though.

Notes

  1. Hence our finite-state machine is a transducer.
  2. This comparison of Applicative and Monad explains why you can't use just sequence for hastyPersonS. In summary, it's because the actions taken (and the number of values in the result list) depend on the outcome of the first action (the initial attempt to push the arm), whereas for the first two always execute two actions and return the corresponding two results.
  3. The technical term for both () and its type is unit.
  4. Random numbers can also be used for many other things, for example simulation, statistical analysis and cryptography
  5. A common source of seeds is the current date and time as given by the internal clock of the computer. Assuming the clock is functioning correctly, it can provide unique seeds suitable for most day-to-day needs (as opposed to applications which demand high-quality randomness, as in cryptography or statistics).
  6. Beyond liftA3, the standard libraries only provide the monad-only liftM4 and liftM5 in Control.Monad.
  7. Alternatively, we could make TurnstileInput an instance of Uniform, but this code seems easier.
  8. We could use Control.Lens for this as described in the later chapter. This module also provides more lenses, automatic creation of lenses for custom data types, easy combining of lenses for deeply-nested subcomponents, etc. Also, Control.Lens.Combinators includes a zoom function that is more generic than processing.