Yet Another Haskell Tutorial/Io
As we mentioned earlier, it is difficult to think of a good, clean way
to integrate operations like input/output into a pure functional
language. Before we give the solution, let's take a step back and
think about the difficulties inherent in such a task.
Any IO library should provide a host of functions, containing (at a minimum) operations like:
- print a string to the screen
- read a string from a keyboard
- write data to a file
- read data from a file
There are two issues here. Let's first consider the initial two examples and think about what their types should be. Certainly the first operation (I hesitate to call it a "function") should take a String argument and produce something, but what should it produce? It could produce a unit (), since there is essentially no return value from printing a string. The second operation, similarly, should return a String, but it doesn't seem to require an argument.
We want both of these operations to be functions, but they are by definition not functions. The item that reads a string from the keyboard cannot be a function, as it will not return the same String every time. And if the first function simply returns () every time, there should be no problem with replacing it with a function f _ = (), due to referential transparency. But clearly this does not have the desired effect.
The RealWorld Solution
In a sense, the reason that these items are not functions is that they interact with the "real world." Their values depend directly on the real world. Supposing we had a type RealWorld, we might write these functions as having type:
    printAString :: RealWorld -> String -> RealWorld
    readAString  :: RealWorld -> (RealWorld, String)
That is, printAString takes a current state of the world and a string to print; it then modifies the state of the world in such a way that the string is now printed and returns this new value. Similarly, readAString takes a current state of the world and returns a new state of the world, paired with the String that was typed.
This would be a possible way to do IO, though it is more than somewhat unwieldy. In this style (assuming an initial RealWorld state were an argument to main), our "Name.hs" program from the section on Interactivity would look something like:
    main rW =
      let rW' = printAString rW "Please enter your name: "
          (rW'',name) = readAString rW'
      in printAString rW'' ("Hello, " ++ name ++ ", how are you?")
This is not only hard to read, but also prone to error: it is easy to accidentally use the wrong version of the RealWorld. Nor does this style rule out programs that make no sense, such as the one below:
    main rW =
      let rW' = printAString rW "Please enter your name: "
          (rW'',name) = readAString rW'
      in printAString rW' -- OOPS!
           ("Hello, " ++ name ++ ", how are you?")
In this program, the reference to rW'' on the last line has been changed to a reference to rW'. It is completely unclear what this program should do. Clearly, it must read a string in order to have a value for name to be printed. But that means that the RealWorld has been updated. However, then we try to ignore this update by using an "old version" of the RealWorld. There is clearly something wrong happening here.
Suffice it to say that doing IO operations in a pure lazy functional language is not trivial.
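To make the problem concrete, the world-passing style can be simulated with a toy model. The sketch below is only an illustration: the World record, its two fields, and the names nameProgram and staleProgram are hypothetical stand-ins for the abstract RealWorld, which in reality cannot be written down like this.

```haskell
-- A toy stand-in for RealWorld (an assumption made for illustration only).
data World = World
  { pendingInput :: [String]  -- lines the user has yet to type
  , screen       :: [String]  -- strings printed so far
  } deriving (Show, Eq)

-- "Printing" appends the string to the simulated screen.
printAString :: World -> String -> World
printAString w s = w { screen = screen w ++ [s] }

-- "Reading" consumes one pending line (or yields "" if none is left).
readAString :: World -> (World, String)
readAString w = case pendingInput w of
  []         -> (w, "")
  (l : rest) -> (w { pendingInput = rest }, l)

-- The Name.hs program, threading each new world into the next step.
nameProgram :: World -> World
nameProgram rW =
  let rW'          = printAString rW "Please enter your name: "
      (rW'', name) = readAString rW'
  in  printAString rW'' ("Hello, " ++ name ++ ", how are you?")

-- The "OOPS" variant: it uses name but continues from the stale world rW'.
staleProgram :: World -> World
staleProgram rW =
  let rW'       = printAString rW "Please enter your name: "
      (_, name) = readAString rW'
  in  printAString rW' ("Hello, " ++ name ++ ", how are you?")
```

In this model, staleProgram greets the user by name while the final world still lists that name as unread input, which is exactly the kind of inconsistency the types fail to rule out.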
Actions
The breakthrough for solving this problem came when Phil Wadler realized that monads would be a good way to think about IO computations. In fact, monads are able to express much more than just the simple operations described above; we can use them to express a variety of constructions like concurrency, exceptions, IO, non-determinism and much more. Moreover, there is nothing special about them; they can be defined within Haskell with no special handling from the compiler (though compilers often choose to optimize monadic operations).
As pointed out before, we cannot think of things like "print a string
to the screen" or "read data from a file" as functions, since they
are not (in the pure mathematical sense). Therefore, we give them
another name: actions. Not only do we give them a special
name, we give them a special type. One particularly useful action is putStrLn, which prints a string to the screen. This action has type:
putStrLn :: String -> IO ()
As expected, putStrLn takes a string argument. What it returns is of type IO (). This means that this function is actually an action (that is what the IO means). Furthermore, when this action is evaluated (or "run"), the result will have type ().
Note
Actually, this type means that putStrLn is an action within the IO monad, but we will gloss over this for now.
You can probably already guess the type of getLine:
getLine :: IO String
This means that getLine is an IO action that, when run, will produce a result of type String.
The question immediately arises: "how do you 'run' an action?" This is something that is left up to the compiler. You cannot actually run an action yourself; instead, a program is, itself, a single action that is run when the compiled program is executed. Thus, the compiler requires that the main function have type IO (), which means that it is an IO action that returns nothing. The compiled code then executes this action.
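The idea that actions are values which are described first and run later can be sketched as follows. The names greet, readName, pending and pendingCount are hypothetical and chosen only for this illustration.

```haskell
-- A function that yields an action once it is given its argument.
greet :: String -> IO ()
greet name = putStrLn ("Hello, " ++ name)

-- An action that produces a String when it is eventually run.
readName :: IO String
readName = getLine

-- Actions are ordinary values: building this list prints nothing.
pending :: [IO ()]
pending = [greet "Simon", greet "Phil"]

-- Counting the actions inspects the list without running any of them.
pendingCount :: Int
pendingCount = length pending
```

Loading this into an interpreter and evaluating pendingCount gives 2 without any greeting appearing on the screen; the greetings are only printed if the actions end up as part of main (or are run at the interpreter prompt).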
However, while you are not allowed to run actions yourself, you are allowed to combine actions. In fact, we have already seen one way to do this using the do notation (how to really do this will be revealed in the chapter Monads).
Let's consider the original name program:
    import System.IO

    main = do
      hSetBuffering stdin LineBuffering
      putStrLn "Please enter your name: "
      name <- getLine
      putStrLn ("Hello, " ++ name ++ ", how are you?")
We can consider the do notation as a way to combine a sequence of actions. Moreover, the <- notation is a way to get the value out of an action. So, in this program, we're sequencing four actions: setting buffering, a putStrLn, a getLine and another putStrLn. The putStrLn function has type String -> IO (); once we provide it a String, the fully applied action has type IO (). This is something that we are allowed to execute.
The getLine action has type IO String, so it is okay to execute it directly. However, in order to get the value out of the action, we write name <- getLine, which basically means "run getLine, and put the results in the variable called name."
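As a small sketch of the two ways of naming things inside a do block: <- binds the result of running an action, while let (standard Haskell, though not discussed above) names a pure value. The helper names greeting, askAndGreet and shout are hypothetical.

```haskell
-- Pure helper: builds the greeting text; no IO is involved.
greeting :: String -> String
greeting name = "Hello, " ++ name ++ ", how are you?"

-- Sequences three actions; <- binds the String produced by getLine.
askAndGreet :: IO ()
askAndGreet = do
  putStrLn "Please enter your name: "
  name <- getLine
  putStrLn (greeting name)

-- let names a pure value inside a do block; nothing is run by it.
shout :: IO ()
shout = do
  let loud = greeting "WORLD"
  putStrLn loud
```

Keeping the string construction in a pure function like greeting is a design choice rather than a requirement; it simply makes that part usable and checkable without any input or output.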
Normal Haskell constructions like if/then/else and case/of can be used within the do notation, but you need to be somewhat careful. For instance, in our "guess the number" program, we have:
    do ...
       if (read guess) < num
         then do putStrLn "Too low!"
                 doGuessing num
         else if read guess > num
                then do putStrLn "Too high!"
                        doGuessing num
                else do putStrLn "You Win!"
If we think about how the if/then/else construction works, it essentially takes three arguments: the condition, the "then" branch, and the "else" branch. The condition needs to have type Bool, and the two branches can have any type, provided that they have the same type. The type of the entire if/then/else construction is then the type of the two branches.
In the outermost comparison, we have (read guess) < num as the condition. This clearly has the correct type. Let's just consider the "then" branch. The code here is:
    do putStrLn "Too low!"
       doGuessing num
Here, we are sequencing two actions: putStrLn and doGuessing. The first has type IO (), which is fine. The second also has type IO (), which is fine. The result type of the entire computation is precisely the type of the final computation. Thus, the type of the "then" branch is also IO (). A similar argument shows that the type of the "else" branch is also IO (). This means the type of the entire if/then/else construction is IO (), which is just what we want.
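The typing rule for if/then/else can be seen in a small sketch with explicit signatures. The functions verdict and report are hypothetical and only mirror the guessing logic above; they are not part of the original program.

```haskell
-- Both branches are Strings, so the whole expression is a String.
verdict :: Int -> Int -> String
verdict num guess =
  if guess < num
    then "Too low!"
    else if guess > num
      then "Too high!"
      else "You Win!"

-- Both branches are IO (), so the whole expression is IO ().
report :: Int -> Int -> IO ()
report num guess =
  if guess == num
    then putStrLn "You Win!"
    else do putStrLn (verdict num guess)
            putStrLn "Try again."
```

Mixing the two, for example returning a String in one branch and an IO () in the other, would be rejected by the type checker.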
Note
In this code, the last line is else do putStrLn "You Win!". This is somewhat overly verbose. In fact, else putStrLn "You Win!" would have been sufficient, since do is only necessary to sequence actions. Since we have only one action here, it is superfluous.
It is incorrect to think to yourself "Well, I already started a do block; I don't need another one," and hence write something like:
    do if (read guess) < num
         then putStrLn "Too low!"
              doGuessing num
         else ...
Here, since we didn't repeat the do, the compiler doesn't know that the putStrLn and doGuessing calls are supposed to be sequenced, and the compiler will think you're trying to call putStrLn with three arguments: the string, the function doGuessing and the integer num. It will certainly complain (though the error may be somewhat difficult to comprehend at this point).
We can write the same doGuessing function using a case expression. To do this, we first introduce the Prelude function compare, which takes two values of the same type (in the Ord class) and returns one of GT, LT, EQ, depending on whether the first is greater than, less than or equal to the second.
    doGuessing num = do
      putStrLn "Enter your guess:"
      guess <- getLine
      case compare (read guess) num of
        LT -> do putStrLn "Too low!"
                 doGuessing num
        GT -> do putStrLn "Too high!"
                 doGuessing num
        EQ -> putStrLn "You Win!"
Here, again, the do keywords after the -> arrows are necessary on the first two options, because we are sequencing actions.
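For reference, here is a minimal sketch of compare on its own, separated from the IO code. The helper names describe and hint are hypothetical.

```haskell
-- Turns the three possible results of compare into messages.
describe :: Ordering -> String
describe LT = "Too low!"
describe GT = "Too high!"
describe EQ = "You Win!"

-- Compares the guess against the target number, as doGuessing does.
hint :: Int -> Int -> String
hint num guess = describe (compare guess num)

-- compare works for any type in the Ord class, not only numbers.
stringOrder :: Ordering
stringOrder = compare "apple" "banana"
```

Because Ordering has exactly three constructors, a case expression that lists LT, GT and EQ covers every possibility.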
If you're used to programming in an imperative language like C or Java, you might think that return will exit you from the current function. This is not so in Haskell. In Haskell, return simply takes a normal value (for instance, one of type Int) and makes it into an action that returns the given value (for instance, the value of type IO Int). In particular, in an imperative language, you might write this function as:
    void doGuessing(int num) {
      print "Enter your guess:";
      int guess = atoi(readLine());
      if (guess == num) {
        print "You win!";
        return ();
      }

      // we won't get here if guess == num
      if (guess < num) {
        print "Too low!";
        doGuessing(num);
      } else {
        print "Too high!";
        doGuessing(num);
      }
    }
Here, because we have the return () in the first if match, we expect the code to exit there (and in most imperative languages, it does). However, the equivalent code in Haskell, which might look something like:
    doGuessing num = do
      putStrLn "Enter your guess:"
      guess <- getLine
      case compare (read guess) num of
        EQ -> do putStrLn "You win!"
                 return ()

      -- we don't expect to get here if guess == num
      if (read guess < num)
        then do putStrLn "Too low!";
                doGuessing num
        else do putStrLn "Too high!";
                doGuessing num
will not behave as you expect. First of all, if you guess correctly, it will first print "You win!", but it won't exit, and it will check whether guess is less than num. Of course it is not, so the else branch is taken, and it will print "Too high!" and then ask you to guess again.
On the other hand, if you guess incorrectly, it will try to evaluate the case expression and get either LT or GT as the result of the compare. In either case, it won't have a pattern that matches, and the program will fail immediately with an exception.
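The behaviour of return can be checked in isolation with a tiny sketch. The name returnDoesNotExit is hypothetical.

```haskell
-- return wraps a value in an action; it does not leave the do block.
returnDoesNotExit :: IO Int
returnDoesNotExit = do
  _ <- return (1 :: Int)            -- yields 1, which is then discarded
  putStrLn "This line still runs."  -- execution continues here
  return 2                          -- the last action gives the result
```

Running this action prints the message and produces 2, not 1: only the final action of a do block determines its result.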
Exercises
Write a program that asks the user for his or her name. If the name is one of Simon, John or Phil, tell the user that you think Haskell is a great programming language. If the name is Koen, tell them that you think debugging Haskell is fun (Koen Claessen is one of the people who works on Haskell debugging); otherwise, tell the user that you don't know who he or she is. Write two different versions of this program, one using if statements, the other using a case statement.
The IO Library
The IO Library (available by importing the System.IO module) contains many definitions, the most common of which are listed below. In current GHC libraries, bracket is exported by Control.Exception rather than System.IO.
    data IOMode = ReadMode | WriteMode
                | AppendMode | ReadWriteMode

    openFile     :: FilePath -> IOMode -> IO Handle
    hClose       :: Handle -> IO ()

    hIsEOF       :: Handle -> IO Bool

    hGetChar     :: Handle -> IO Char
    hGetLine     :: Handle -> IO String
    hGetContents :: Handle -> IO String

    getChar      :: IO Char
    getLine      :: IO String
    getContents  :: IO String

    hPutChar     :: Handle -> Char -> IO ()
    hPutStr      :: Handle -> String -> IO ()
    hPutStrLn    :: Handle -> String -> IO ()

    putChar      :: Char -> IO ()
    putStr       :: String -> IO ()
    putStrLn     :: String -> IO ()

    readFile     :: FilePath -> IO String
    writeFile    :: FilePath -> String -> IO ()

    bracket      :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
Note
The type FilePath is a type synonym for String. That is, there is no difference between FilePath and String. So, for instance, the readFile function takes a String (the file to read) and returns an action that, when run, produces the contents of that file. See the section on Synonyms for more about type synonyms.
Most of these functions are self-explanatory. The openFile and hClose functions open and close a file, respectively, using the IOMode argument as the mode for opening the file. hIsEOF tests for end-of-file. hGetChar and hGetLine read a character or line (respectively) from a file. hGetContents reads the entire file. The getChar, getLine and getContents variants read from standard input. hPutChar prints a character to a file; hPutStr prints a string; and hPutStrLn prints a string with a newline character at the end. The variants without the h prefix work on standard output. The readFile and writeFile functions read or write an entire file without having to open it first.
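As a rough sketch of how the handle-based functions fit together, the following counts the lines of a file using openFile, hIsEOF, hGetLine and hClose. The names countLines and countFrom are hypothetical, and the code does not yet protect against errors.

```haskell
import System.IO

-- Opens the file, counts its lines one at a time, then closes it.
countLines :: FilePath -> IO Int
countLines path = do
  h <- openFile path ReadMode
  n <- countFrom h 0
  hClose h
  return n

-- Reads lines until end-of-file, adding one to the count per line.
countFrom :: Handle -> Int -> IO Int
countFrom h acc = do
  eof <- hIsEOF h
  if eof
    then return acc
    else do _ <- hGetLine h
            countFrom h (acc + 1)
```

If hGetLine fails partway through, the handle is never closed by this version.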
The bracket function is used to perform actions safely. Consider a function that opens a file, writes a character to it, and then closes the file. When writing such a function, one needs to be careful to ensure that, if there were an error at some point, the file is still successfully closed. The bracket function makes this easy. It takes three arguments: The first is the action to perform at the beginning. The second is the action to perform at the end, regardless of whether there's an error or not. The third is the action to perform in the middle, which might result in an error. For instance, our character-writing function might look like:
    writeChar :: FilePath -> Char -> IO ()
    writeChar fp c =
      bracket
        (openFile fp WriteMode)  -- the file must be opened for writing
        hClose
        (\h -> hPutChar h c)
This will open the file, write the character and then close the file. However, if writing the character fails, hClose will still be executed, and the exception will be re-raised afterwards. That way, you don't need to worry too much about catching the exceptions and about closing all of your handles.
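The order in which bracket runs its three arguments can be observed with a small experiment. This is only a sketch: it records steps in an IORef (from Data.IORef, not otherwise used in this tutorial) instead of touching real files, uses try from Control.Exception to observe the re-raised exception, and the name bracketDemo is hypothetical.

```haskell
import Control.Exception (IOException, bracket, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Runs a failing middle action inside bracket and returns the steps taken.
bracketDemo :: IO [String]
bracketDemo = do
  steps <- newIORef []
  let record s = modifyIORef steps (++ [s])
      -- The middle action: records its step, then fails on purpose.
      failing _ = do
        record "use"
        ioError (userError "simulated failure")
  result <- try (bracket (record "acquire") (\_ -> record "release") failing)
  case (result :: Either IOException ()) of
    Left _   -> record "exception re-raised"
    Right () -> record "no exception"
  readIORef steps
```

Running bracketDemo yields the steps "acquire", "use", "release" and "exception re-raised", in that order: the release step runs even though the middle action failed, and the failure is still visible to the caller afterwards.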
A File Reading Program
We can write a simple program that allows a user to read and write files. The interface is admittedly poor, and it does not catch all errors (try reading a non-existent file). Nevertheless, it should give a fairly complete example of how to use IO. Enter the following code into "FileRead.hs," and compile/run:
    module Main
        where

    import System.IO
    import Control.Exception

    main = do
      hSetBuffering stdin LineBuffering
      doLoop

    doLoop = do
      putStrLn "Enter a command rFN wFN or q to quit:"
      command <- getLine
      case command of
        'q':_ -> return ()
        'r':filename -> do putStrLn ("Reading " ++ filename)
                           doRead filename
                           doLoop
        'w':filename -> do putStrLn ("Writing " ++ filename)
                           doWrite filename
                           doLoop
        _            -> doLoop

    doRead filename =
      bracket (openFile filename ReadMode) hClose
              (\h -> do contents <- hGetContents h
                        putStrLn "The first 100 chars:"
                        putStrLn (take 100 contents))

    doWrite filename = do
      putStrLn "Enter text to go into the file:"
      contents <- getLine
      bracket (openFile filename WriteMode) hClose
              (\h -> hPutStrLn h contents)
What does this program do? First, it issues a short string of instructions and reads a command. It then performs a case switch on the command and checks first to see if the first character is a 'q'. If it is, it returns a value of unit type.
Note
The return function is a function that takes a value of type a and returns an action of type IO a. Thus, the type of return () is IO ().
If the first character of the command wasn't a 'q', the program checks to see if it was an 'r' followed by some string that is bound to the variable filename. It then tells you that it's reading the file, does the read and runs doLoop again. The check for 'w' is nearly identical. Otherwise, it matches _, the wildcard pattern, and loops back to doLoop.
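The pattern matching on the first character can be tried out separately from the IO code. The sketch below uses a hypothetical Command type and parseCommand function; the original program matches on the string directly inside doLoop instead.

```haskell
-- The commands the loop understands (a type invented for this sketch).
data Command = Quit | ReadFrom FilePath | WriteTo FilePath | Unknown
  deriving (Show, Eq)

-- Uses the same patterns as doLoop: the first character selects the
-- command, and the rest of the string is bound as the file name.
parseCommand :: String -> Command
parseCommand command = case command of
  'q':_        -> Quit
  'r':filename -> ReadFrom filename
  'w':filename -> WriteTo filename
  _            -> Unknown
```

Note that, just as in the original loop, anything beginning with 'q' (such as "quit") quits, and the empty string falls through to the wildcard case.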
The doRead function uses the bracket function to make sure the file is closed even if there is a problem while reading it. It opens a file in ReadMode, reads its contents and prints the first 100 characters (the take function takes an integer n and a list and returns the first n elements of the list).
The doWrite function asks for some text, reads it from the keyboard, and then writes it to the file specified.
Note
Both doRead and doWrite could have been made simpler by using readFile and writeFile, but they were written in the extended fashion to show how the more complex functions are used.
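For comparison, the simpler versions might look roughly like the following sketch. The names doReadSimple, doWriteSimple and preview are hypothetical; the newline appended in doWriteSimple mirrors what hPutStrLn does in the original.

```haskell
-- Pure helper: the part of the contents that gets shown.
preview :: String -> String
preview = take 100

-- Reads the whole file with readFile instead of openFile/hGetContents.
doReadSimple :: FilePath -> IO ()
doReadSimple filename = do
  contents <- readFile filename
  putStrLn "The first 100 chars:"
  putStrLn (preview contents)

-- Writes one line of user input with writeFile instead of hPutStrLn.
doWriteSimple :: FilePath -> IO ()
doWriteSimple filename = do
  putStrLn "Enter text to go into the file:"
  contents <- getLine
  writeFile filename (contents ++ "\n")
```

These versions give up explicit control over the handle, which is usually fine for small programs like this one.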
The only major problem with this program is that it will die if you try to read a file that doesn't already exist or if you specify some bad filename like *\^#_@. You may think that the calls to bracket in doRead and doWrite should take care of this, but they don't. They only catch exceptions within the main body, not within the startup or shutdown functions (openFile and hClose, in these cases). We would need to catch exceptions raised by openFile, in order to make this complete. We will do this when we talk about exceptions in more detail in the section on Exceptions.
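As a preview only, one possible way to keep the program alive is to wrap the read in try from Control.Exception, which turns an IO exception into an ordinary Either value. This is a sketch under that assumption, not necessarily the approach taken in the section on Exceptions; safeRead and doReadSafe are hypothetical names.

```haskell
import Control.Exception (IOException, try)

-- Attempts to read a file; a failure to open it becomes a Left value.
safeRead :: FilePath -> IO (Either IOException String)
safeRead filename = try (readFile filename)

-- Like doRead, but reports a problem instead of crashing.
doReadSafe :: FilePath -> IO ()
doReadSafe filename = do
  result <- safeRead filename
  case result of
    Left _         -> putStrLn "Sorry, that file could not be opened."
    Right contents -> do putStrLn "The first 100 chars:"
                         putStrLn (take 100 contents)
```

With this in place, a missing file produces a message and control returns to the caller.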
Exercises
Write a program that first asks whether the user wants to read from a file, write to a file or quit. If the user responds quit, the program should exit. If he responds read, the program should ask him for a file name and print that file to the screen (if the file doesn't exist, the program may crash). If he responds write, it should ask him for a file name and then ask him for text to write to the file, with "." signaling completion. All but the "." should be written to the file. For example, running this program might produce:
    Do you want to [read] a file, [write] a file or [quit]?
    read
    Enter a file name to read:
    foo
    ...contents of foo...
    Do you want to [read] a file, [write] a file or [quit]?
    write
    Enter a file name to write:
    foo
    Enter text (dot on a line by itself to end):
    this is some text for foo
    .
    Do you want to [read] a file, [write] a file or [quit]?
    read
    Enter a file name to read:
    foo
    this is some text for foo
    Do you want to [read] a file, [write] a file or [quit]?
    read
    Enter a file name to read:
    foof
    Sorry, that file does not exist.
    Do you want to [read] a file, [write] a file or [quit]?
    blech
    I don't understand the command blech.
    Do you want to [read] a file, [write] a file or [quit]?
    quit
    Goodbye!