Posted on: 25/03/2018
A GitChapter Introduction to Monad Transformers
or, Values as Exceptions
A big thank you to Christoffer Stjernlöf (https://two-wrongs.com/), who provided his work under a Creative Commons Share Alike license, as well as to the other contributors. This article was originally posted at https://github.com/kqr/gists/blob/master/articles/gentle-introduction-monad-transformers.md.
This project is written with https://github.com/chrissound/GitChapter, which means you can clone the project from https://github.com/chrissound/GentleIntroductionToMonadTransformers and hack along from each section where you see the following:
Git From Commit:
f47adfa0e7a25f13ed073599983c578773f6fc75
Git Until Commit:
538140635ec8be16a7be0aae85c1916bced79fa4
The rewrite using GitChapter is partially completed. Hoping to finish this eventually!
Either Left or Right
Before we break into the mysterious world of monad transformers, I want to start with reviewing a much simpler concept, namely the Either data type. If you aren’t familiar with the Either type, you should probably not jump straight into monad transformers – they do require you to be somewhat comfortable with the most common monads.
With that out of the way:
Pretend we have a function that extracts the domain from an email address. Actually checking this properly is a rather complex topic which I will avoid, and instead I will assume an email address can only contain one @ symbol, and everything after it is the domain.
I’m going to work with Text values rather than Strings. This means if you don’t have the text library, you can either work with Strings instead, or cabal install text. If you have the Haskell platform, you have the text library.
We need to import Data.Text and set the OverloadedStrings pragma. The latter lets us write string literals (such as "Hello, world!") and have them become Text values automatically.
λ> :module +Data.Text
λ> :set -XOverloadedStrings
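As a minimal sketch of what the pragma buys us in a source file (the names greeting and shouted are made up for this illustration), a plain string literal can be given the type Text and passed straight to Data.Text functions:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- With OverloadedStrings enabled, this literal is a Text value, not a String.
-- (greeting and shouted are hypothetical names used only for this sketch.)
greeting :: Text
greeting = "Hello, world!"

-- Data.Text functions can be applied to it directly, with no conversion.
shouted :: Text
shouted = T.toUpper greeting
```

Without the pragma, the literal would be a String and the definition of greeting would be rejected by the type checker.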
Now, figuring out how many @ symbols there are in an email address is fairly simple. We can see that
ghci> :set -XOverloadedStrings
ghci> :module +Data.Text
ghci> splitOn "@" ""
[""]
ghci> splitOn "@" "test"
["test"]
ghci> splitOn "@" "test@example.com"
["test","example.com"]
ghci> splitOn "@" "test@example@com"
["test","example","com"]
So if the split gives us exactly two elements back, we know the address contains exactly one @ symbol, and as a bonus we also know that the second element of the list is the domain we wanted. We can put this in a file.
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
-- Imports that will be needed later:
import qualified Data.Text.IO as T
import Data.Map as Map
import Control.Applicative
data LoginError = InvalidEmail
  deriving Show
getDomain :: Text -> Either LoginError Text
getDomain email =
  case splitOn "@" email of
    [name, domain] -> Right domain
    _              -> Left InvalidEmail
This draws on our previous discoveries and is pretty self-explanatory. The function returns Right domain if the address is valid, otherwise Left InvalidEmail, a custom error type we use to make handling the errors easier later on. (Why this is called LoginError will be apparent soon.)
This function behaves as we expect it to.
ghci> :load src/Main.hs
ghci> getDomain "test@example.com"
Right "example.com"
ghci> getDomain "invalid.email@example@com"
Left InvalidEmail
To deal with the result of this function immediately, we have a couple of alternatives. The basic tool to deal with Either values is pattern matching, in other words,
printResult' :: Either LoginError Text -> IO ()
printResult' domain =
  case domain of
    Right text        -> T.putStrLn (append "Domain: " text)
    Left InvalidEmail -> T.putStrLn "ERROR: Invalid domain"
Testing in the interpreter shows us that
ghci> printResult' (getDomain "test@example.com")
Domain: example.com
ghci> printResult' (getDomain "test#example.com")
ERROR: Invalid domain
Another way of dealing with Either values is by using the either function. either has the type signature
either :: (a -> c) -> (b -> c) -> Either a b -> c
In other words, it “unpacks” the Either value and applies one of the two functions to get a c value back. In this program, we have an Either LoginError Text and we want just a Text back, which tells us what to print. So we can view the signature of either as
either :: (LoginError -> Text) -> (Text -> Text) -> Either LoginError Text -> Text
and writing printResult with the help of either yields a pretty neat function.
printResult :: Either LoginError Text -> IO ()
printResult = T.putStrLn . either
  (const "ERROR: Invalid domain")
  (append "Domain: ")
This function works the same way as the previous one, except with the pattern matching hidden inside the call to either.
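To see either in isolation, here is a small sketch on simpler types. The describe function is hypothetical and not part of the login program; it only illustrates that the first function handles Left values and the second handles Right values.

```haskell
-- From the Prelude:
--   either :: (a -> c) -> (b -> c) -> Either a b -> c

-- Hypothetical helper for illustration: turn an error message or a number
-- into a description. The first argument to either handles Left values,
-- the second handles Right values.
describe :: Either String Int -> String
describe = either ("failed: " ++) (\n -> "got " ++ show n)
```

Here describe (Left "oops") evaluates to "failed: oops" and describe (Right 42) evaluates to "got 42".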
Introducing Side-Effects
Git From Commit:
538140635ec8be16a7be0aae85c1916bced79fa4
Git Until Commit:
349c427a292d5e867c088bab22d030c7eec78fed
Now we’ll use the domain as some sort of “user token” – a value the user uses to prove they have authenticated. This means we need to ask the user for their email address and return the associated token.
getToken :: IO (Either LoginError Text)
getToken = do
  T.putStrLn "Enter email address:"
  email <- T.getLine
  return (getDomain email)
So when getToken runs, it’ll get an email address from the user and return the domain of the email address.
λ> getToken
Enter email address:
test@example.com
Right "example.com"
and, importantly,
λ> getToken
Enter email address:
not.an.email.address
Left InvalidEmail
Now, let’s complete this with an authentication system. We’ll have two users who both have terrible passwords:
users :: Map Text Text
users = Map.fromList [("example.com", "qwerty123"), ("localhost", "password")]
With an authentication system, we can also run into two new kinds of errors, so let’s change our LoginError data type to reflect that.
data LoginError = InvalidEmail
                | NoSuchUser
                | WrongPassword
  deriving Show
We also need to write the actual authentication function. Here we go…
userLogin :: IO (Either LoginError Text)
userLogin = do
  token <- getToken
  case token of
    Right domain ->
      case Map.lookup domain users of
        Just userpw -> do
          T.putStrLn "Enter password:"
          password <- T.getLine
          if userpw == password
            then return token
            else return (Left WrongPassword)
        Nothing -> return (Left NoSuchUser)
    left -> return left
The userLogin function checks that the email was processed without problems, finds the user in the collection of users, and if the passwords match, it returns the token to show the user is authenticated.
If anything goes wrong, such as the passwords not matching, there not being a user with the entered domain, or the getToken function failing to process the input, then a Left value will be returned.
This function is not something we want to deal with. It’s big, it’s bulky, it has several layers of nesting… it’s not the Haskell we know and love.
Sure, it’s possible to rewrite it using function calls to either and maybe, but that wouldn’t help very much. The real reason the code is this ugly is that we’re trying to mix both Either and IO, and they don’t seem to blend well.
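For contrast, here is a sketch of the same checks with the IO taken out. The checkLogin function is hypothetical: it receives the email and password as arguments instead of reading them from the user, and LoginError additionally derives Eq here only so results can be compared. With nothing but Either involved, do-notation keeps the logic flat and stops at the first Left.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Map as Map

-- Eq is derived only for this sketch, so that results can be compared.
data LoginError = InvalidEmail | NoSuchUser | WrongPassword
  deriving (Show, Eq)

users :: Map.Map Text Text
users = Map.fromList [("example.com", "qwerty123"), ("localhost", "password")]

getDomain :: Text -> Either LoginError Text
getDomain email =
  case T.splitOn "@" email of
    [_, domain] -> Right domain
    _           -> Left InvalidEmail

-- Hypothetical pure variant of userLogin: no prompts, no IO.
-- Each step runs only if the previous one produced a Right.
checkLogin :: Text -> Text -> Either LoginError Text
checkLogin email password = do
  domain <- getDomain email
  userpw <- maybe (Left NoSuchUser) Right (Map.lookup domain users)
  if userpw == password
    then Right domain
    else Left WrongPassword
```

The nesting disappears because the Either monad threads the error for us. The catch is that this version can no longer ask the user for anything.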
The core of the problem is that the IO monad is designed for dealing with IO actions, and it’s terrible at handling errors. On the other hand, the Either monad is great at handling errors, but it can’t do IO. So let’s explore what happens if you imagine a monad that is designed to both handle errors and IO actions.
Too good to be true? Read on and find out.
We Can Make Our Own Monads
We keep coming across the IO (Either e a) type, so maybe there is something special about that. What happens if we make a Haskell data type out of that combination?
data EitherIO e a = EitherIO {
    runEitherIO :: IO (Either e a)
}
What did we get just by doing this? Let’s see:
ghci> :l src/Main.hs
ghci> :type EitherIO
EitherIO :: IO (Either e a) -> EitherIO e a
ghci> :type runEitherIO
runEitherIO :: EitherIO e a -> IO (Either e a)
So already we have a way to go between our own type and the combination we used previously! That’s gotta be useful somehow.
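A small sketch of going back and forth between the two representations. The data declaration below is an assumption, written to match the types GHCi reports for EitherIO and runEitherIO; the helpers succeed, failWith and roundTrip are hypothetical names used only here.

```haskell
-- Assumed definition, consistent with the types reported by GHCi:
--   EitherIO    :: IO (Either e a) -> EitherIO e a
--   runEitherIO :: EitherIO e a -> IO (Either e a)
data EitherIO e a = EitherIO { runEitherIO :: IO (Either e a) }

-- Hypothetical helper: wrap a successful value.
succeed :: a -> EitherIO e a
succeed = EitherIO . return . Right

-- Hypothetical helper: wrap an error.
failWith :: e -> EitherIO e a
failWith = EitherIO . return . Left

-- Wrapping and then unwrapping gives back the IO action we started with.
roundTrip :: IO (Either e a) -> IO (Either e a)
roundTrip = runEitherIO . EitherIO
```

Running runEitherIO (succeed 1) in GHCi performs the underlying IO action and shows Right 1, while runEitherIO (failWith "bad") shows Left "bad".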
Implementing Instances for Common Typeclasses
This section might be a little difficult if you’re new to the language and haven’t had a lot of exposure to the internals of how common typeclasses work. You don’t need to understand this section to continue reading the article, but I strongly suggest you add learning enough to understand it to your to-do list. It touches on many of the core components of what makes Haskell Haskell and not just another functional language.
If you want to read more about this kind of thing, The Typeclassopedia by Brent Yorgey is a comprehensive reference of the most common typeclasses used in Haskell code. Learn You a Haskell has a popular introduction to the three typeclasses we use here. Additionally, Adit provides us with a humorous picture guide to the same three typeclasses.
But before we do anything else, let’s make EitherIO a functor, an applicative and a monad, starting with the functor, of course.
instance Functor (EitherIO e) where
  fmap f ex = wrapped
    where
      unwrapped = runEitherIO ex
      fmapped   = fmap (fmap f) unwrapped
      wrapped   = EitherIO fmapped
This may look a little silly initially, but it does make sense. First, we “unwrap” the EitherIO type to expose the raw IO (Either e a) value. Then we fmap over the inner a, by combining two fmaps. Then we wrap the new value up in EitherIO again, and return the wrapped value. If you are a more experienced Haskell user, you might prefer the following, equivalent, definition instead.
instance Functor (EitherIO e) where
  fmap f = EitherIO . fmap (fmap f) . runEitherIO
In a sense, that definition makes it more clear that you are just unwrapping, running a function on the inner value, and then wrapping it together again.
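Here is a short sketch of the functor instance in use. The data declaration and the compact, point-free form of fmap are assumptions about how the definitions might be written; doubled and stillFailed are hypothetical example values.

```haskell
-- Assumed definition of the wrapper type.
data EitherIO e a = EitherIO { runEitherIO :: IO (Either e a) }

-- Unwrap, map through both the IO layer and the Either layer, wrap again.
instance Functor (EitherIO e) where
  fmap f = EitherIO . fmap (fmap f) . runEitherIO

-- fmap changes the value inside a Right...
doubled :: EitherIO String Int
doubled = fmap (* 2) (EitherIO (return (Right 21)))

-- ...and leaves a Left untouched.
stillFailed :: EitherIO String Int
stillFailed = fmap (* 2) (EitherIO (return (Left "no value")))
```

Running runEitherIO doubled gives Right 42, and runEitherIO stillFailed gives Left "no value", since the function is never applied to an error.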
The two other instances are more of the same, really. Creating them is a mostly mechanical process of following the types and unwrapping and wrapping our custom type. I challenge the reader to come up with these instances on their own before looking at how I did it below, because trying to figure these things out will improve your Haskell abilities in the long run.
In any case, explaining them gets boring, so I’ll just show you the instances as an experienced Haskell user might write them.
instance Applicative (EitherIO e) where
  pure    = EitherIO . return . Right
  f <*> x = EitherIO $ liftA2 (<*>) (runEitherIO f) (runEitherIO x)
instance Monad (EitherIO e) where
  return  = pure
  x >>= f = EitherIO $ runEitherIO x >>= either (return . Left) (runEitherIO . f)
If your definitions look nothing like these, don’t worry. As long as your definitions give the correct results, they are just as good as mine. There are many ways to write these definitions, and none is better than the other as long as all are correct.
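One way to check that a set of instances gives the results we expect is to try them on a tiny computation. The sketch below repeats the three instances so that it is self-contained (the data declaration and the point-free Functor instance are assumptions), and adds two hypothetical functions, safeDiv and calc, to show that a Left stops the rest of a do block.

```haskell
import Control.Applicative (liftA2)

-- Assumed definition of the wrapper type.
data EitherIO e a = EitherIO { runEitherIO :: IO (Either e a) }

instance Functor (EitherIO e) where
  fmap f = EitherIO . fmap (fmap f) . runEitherIO

instance Applicative (EitherIO e) where
  pure    = EitherIO . return . Right
  f <*> x = EitherIO $ liftA2 (<*>) (runEitherIO f) (runEitherIO x)

instance Monad (EitherIO e) where
  return  = pure
  x >>= f = EitherIO $ runEitherIO x >>= either (return . Left) (runEitherIO . f)

-- Hypothetical example: division that reports an error instead of crashing.
safeDiv :: Int -> Int -> EitherIO String Int
safeDiv _ 0 = EitherIO (return (Left "division by zero"))
safeDiv n d = pure (n `div` d)

-- If the first division fails, the remaining steps are skipped.
calc :: Int -> EitherIO String Int
calc d = do
  q <- safeDiv 100 d
  r <- safeDiv q 2
  return (r + 1)
```

Running runEitherIO (calc 5) gives Right 11, while runEitherIO (calc 0) gives Left "division by zero" without ever reaching the second division.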