Haskell/Foldable
The Foldable
type class provides a generalisation of list folding (foldr
and friends) and operations derived from it to arbitrary data structures. Besides being extremely useful, Foldable
is a great example of how monoids can help formulating good abstractions.
Deconstructing foldr
[edit | edit source]This paragraph will introduce foldMap
as a replacement of foldr
. We will show how to implement the latter in terms of the former.
foldr
is quite a busy function − two binary functions on each side of the first function arrow, with types which use two variables each.
foldr :: (a -> b -> b) -> b -> [a] -> b
If we are going to generalise foldr
, it would be convenient to have something simpler to work with, or at least to be able to break it down into simpler components. What could those components be?
A rough description of list folding would be that it consists of running through the list elements and combining them with a binary function. We happen to know one type class which is all about combining pairs of values: Monoid
. If we take foldr f z
...
a `f` (b `f` (c `f` z)) -- foldr f z [a,b,c]
... and make f = (<>)
and z = mempty
...
a <> (b <> (c <> mempty)) -- foldr (<>) mempty [a,b,c]
... we get mconcat = foldr mappend mempty
, which is a simpler, specialised foldr
in which we do not need to specify the combining function nor initial accumulator, as we simply use mappend
(i.e. (<>)
) and mempty
:
mconcat :: Monoid m => [m] -> m
mconcat
captures the combine-all-elements aspect of foldr
well enough, and covers a few of its use cases:
GHCi> mconcat ["Tree", "fingers"] -- concat
"Treefingers"
Neat − but surely we don't want to be restricted to folding with Monoid
instances only. One way to improve the situation a bit is by realising we can use mconcat
to fold a list with elements of any type, as long as we have a function to convert them to some Monoid
type:
foldMap :: Monoid m => (a -> m) -> [a] -> m
foldMap g = mconcat . fmap g
That makes things more interesting already:
GHCi> foldMap Sum [1..10]
Sum {getSum = 55}
So far so good, but it seems that we are still unable to fold with arbitrary combining functions. It turns out, however, that any binary function that fits the foldr
signature can be used to convert values to a Monoid
type! The trick is looking at the combining function passed to foldr
as a function of one argument...
foldr :: (a -> (b -> b)) -> b -> [a] -> b
... and taking advantage of the fact that b -> b
functions form a monoid under composition, with (.)
as mappend
and id
as mempty
[1]. The corresponding Monoid
instance is available through the Endo
wrapper from Data.Monoid
[2]:
newtype Endo b = Endo { appEndo :: b -> b }
instance Monoid Endo where
mempty = Endo id
Endo g `mappend` Endo f = Endo (g . f)
We can now define...
foldComposing :: (a -> (b -> b)) -> [a] -> Endo b
foldComposing f = foldMap (Endo . f)
... which makes a b -> b
function out of each element and composes them all:
Endo (f a) <> (Endo (f b) <> (Endo (f c) <> (Endo id))) -- foldComposing f [a,b,c]
Endo (f a . (f b . (f c . id)))
-- (<>) and (.) are associative, so we don't actually need the parentheses.
-- As an example, here is a step-by-step evaluation:
foldComposing (+) [1, 2, 3]
foldMap (Endo . (+)) [1, 2, 3]
mconcat (fmap (Endo . (+)) [1, 2, 3])
mconcat (fmap Endo [(+1), (+2), (+3)])
mconcat [Endo (+1), Endo (+2), Endo (+3)]
Endo ((+1) . (+2) . (+3))
Endo (+6)
If we apply that function to some b
value...
foldr :: (a -> (b -> b)) -> b -> [a] -> b
foldr f z xs = appEndo (foldComposing f xs) z
...we finally recover foldr
. That means we can define foldr
in terms of foldMap
, a function which is much simpler and therefore easier to reason about. For that reason, foldMap
is the conceptual heart of Foldable
, the class which generalises foldr
to arbitrary data structures.
Exercises |
---|
|
The Foldable
class
[edit | edit source]Implementing Foldable
for a data structure requires writing just one function: either foldMap
or foldr
. Foldable
, however, has a lot of other methods:
-- Abridged definition, with just the method signatures.
class Foldable t where
foldMap :: Monoid m => (a -> m) -> t a -> m
foldr :: (a -> b -> b) -> b -> t a -> b
-- All of the following have default implementations:
fold :: Monoid m => t m -> m -- generalised mconcat
foldr' :: (a -> b -> b) -> b -> t a -> b
foldl :: (b -> a -> b) -> b -> t a -> b
foldl' :: (b -> a -> b) -> b -> t a -> b
foldr1 :: (a -> a -> a) -> t a -> a
foldl1 :: (a -> a -> a) -> t a -> a
toList :: t a -> [a]
null :: t a -> Bool
length :: t a -> Int
elem :: Eq a => a -> t a -> Bool
maximum :: Ord a => t a -> a
minimum :: Ord a => t a -> a
sum :: Num a => t a -> a
product :: Num a => t a -> a
The extra methods are there so that more efficient implementations can be written if necessary. In any case, writing just foldMap
or foldr
gives you all of the very useful functions listed above for free. And it gets even better: Data.Foldable provides still more functions generalised to any Foldable
, including, remarkably, mapM_
/traverse_
.
Here is a quick demonstration of Foldable
using Data.Map [3]:
GHCi> import qualified Data.Map as M
GHCi> let testMap = M.fromList $ zip [0..] ["Yesterday","I","woke","up","sucking","a","lemon"]
GHCi> length testMap
7
GHCi> sum . fmap length $ testMap
29
GHCi> elem "lemon" testMap
True
GHCi> foldr1 (\x y -> x ++ (' ' : y)) testMap -- Be careful: foldr1 is partial!
"Yesterday I woke up sucking a lemon"
GHCi> import Data.Foldable
GHCi> traverse_ putStrLn testMap
Yesterday
I
woke
up
sucking
a
lemon
Beyond providing useful generalisations, Foldable
and foldMap
suggest a more declarative way of thinking about folds. For instance, instead of describing sum
as a function which runs across a list (or tree, or whatever the data structure is) accumulating its elements with (+)
, we might say that it queries each element for its value and summarises the results of the queries using the Sum
monoid. Though the difference may seem small, the monoidal summary perspective can help clarifying problems involving folds by separating the core issue of what sort of result we wish to obtain from the details of the data structure being folded.
Exercises |
---|
|
List-like folding
[edit | edit source]Foldable
includes the toList :: Foldable t => t a -> [a]
method. That means any Foldable
data structure can be turned into a list; moreover, folding the resulting list will produce the same results as folding the original structure directly. A possible toList
implementation in terms of foldMap
would be [4]:
toList = foldMap (\x -> [x])
toList
reflects the fact that lists are the free monoid for Haskell types. "Free" here means any value can be promoted to the monoid in a way which neither adds nor erases any information (we can convert values of type a
to [a]
lists with a single element and back through (\x->[x])
and head
in a lossless way) [5].
A related key trait of Foldable
is made obvious by toList
. Since toList = id
for lists, if you are given a function defined as...
-- Given a list xs :: [a]
xsAsFoldMap :: Monoid m => (a -> m) -> m
xsAsFoldMap = \f -> foldMap f xs
... it is always possible to recover the original list xs
by supplying (\x->[x])
to xsAsFoldMap
. In this sense, lists are equivalent to their right folds. That implies folding with the Foldable
operations will unavoidably be a lossy operation if the data structure is more complex than a list. Putting it in another way, we can say that the list-like folds offered by Foldable
are less general than folds of the sort we have seen back in Other data structures (formally known as catamorphisms), which do make it possible to reconstruct the original structure.
Exercises |
---|
|
More facts about Foldable
[edit | edit source]Foldable
is slightly unusual among Haskell classes which are both principled and general-purpose in that it has no laws of its own. The closest thing is the following property, which strictly speaking is not a law (as it is guaranteed to hold whatever the instance is): given a monoid homomorphism g
,
foldMap (g . f) = g . foldMap f
Switching from foldMap (g . f)
to g . foldMap f
can be advantageous, as it means applying g
only to the result of the fold, rather than to the potentially many elements in the structure being folded.
If the Foldable
structure is a Functor
as well, it also automatically holds that...
foldMap f = fold . fmap f
... and thus we get, after applying the second functor law and the property just above:
foldMap g . fmap f = foldMap (g . f) = g . foldMap f
Though the presence of a method such as foldMap
might suggest that any Foldable
types should have Functor
instances as well, Functor
is not actually a superclass of Foldable
. That makes it possible to give Foldable
instances to structures that, for whatever reason, cannot be Functor
s. The most common example are the sets from Data.Set. Element types for those sets must be instances of Ord
, and therefore their map
function cannot be used as fmap
, which has no additional class constraints. That, however, does not deny Data.Set.Set
an useful Foldable
instance.
GHCi> import qualified Data.Set as S
GHCi> let testSet = S.fromList [1,3,2,5,5,0]
GHCi> testSet
fromList [0,1,2,3,5]
GHCi> import Data.Foldable
GHCi> toList testSet
[0,1,2,3,5]
GHCi> foldMap show testSet
"01235"
Exercises |
---|
|
Notes
- ↑ This trick will probably ring familiar if you did the exercise about
foldl
at the end of Higher order functions. - ↑ "Endo" is shorthand for "endomorphism", a jargony word for functions from one type to the same type.
- ↑ For more information on
Data.Map
and other useful data structure implementations with, see the data structures primer. - ↑
Data.Foldable
uses a different default implementation for performance reasons. - ↑ There is one caveat relating to non-termination with regards to saying lists form a free monoid. For details, see the Free Monoids in Haskell post by Dan Doel. (Note that the discussion there is quite advanced. You might not enjoy it much right now if you have just been introduced to
Foldable
.) - ↑ Source (Haskell Café): [1]