I read Max Rabkin’s recent post Beautiful folding with great excitement. He shows how to make combine multiple folds over the same list into a single pass, which can then drastically reduce memory requirements of a lazy functional program. Max’s trick is giving folds a data representation and a way to combine representations that corresponds to combining the folds.
Peeking out from behind Max’s definitions is a lovely pattern I’ve been noticing more and more over the last couple of years, namely type class morphisms.
Folds as data
Max gives a data representation of folds and adds on an post-fold step, which makes them composable.
data Fold b c = forall a. F (a -> b -> a) a (a -> c)
The components of a
Fold are a (strict left) fold’s combiner function and initial value, plus a post-fold step.
This interpretation is done by a function
cfoldl', which turns these data folds into function folds:
cfoldl' :: Fold b c -> [b] -> c cfoldl' (F op e k) = k . foldl' op e
foldl' is the standard strict, left-fold functional:
As Twan van Laarhoven pointed out in a comment on Max’s post,
Fold b is a functor and an applicative functor, so some of Max’s
Fold-manipulating functions can be replaced by standard vocabulary.
Functor instance is pretty simple:
Applicative instance is a bit trickier.
For strictness, Max used used a type of strict pairs:
data Pair c c' = P !c !c'
Fold b is an applicative functor, Max’s
bothWith function is then
multi-cfoldl' rule then becomes:
forall c f g xs. h (cfoldl' f xs) (cfoldl' g xs) == cfoldl' (liftA2 h f g) xs
thus replacing two passes with one pass.
Now here’s the fun part.
Looking at the
Applicative instance for
((->) a), the rule above is equivalent to
forall c f g. liftA2 h (cfoldl' f) (cfoldl' g) == cfoldl' (liftA2 h f g)
Flipped around, this rule says that
liftA2 distributes over
Or, “the meaning of
liftA2 property is equivalent to the following:
forall f g. cfoldl' f <*> cfoldl' g == cfoldl' (f <*> g)
This form is one of the two
Applicative morphism laws (which I usually write in the reverse direction):
For more about these morphisms, see Simplifying semantics with type class morphisms.
That post suggests that semantic functions in particular ought to be type class morphisms (and if not, then you’d have an abstraction leak).
cfoldl' is a semantic function, in that it gives meaning to a
The other type class morphisms in this case are
Applicative instances of
((->) a), these two properties are equivalent to
cfoldl' (pure a ) xs == a cfoldl' (fmap h f) xs == h (cfoldl' f xs)
Max pointed out that GHC does not handle his original
The reason is that the head of the LHS (left-hand side) is a variable.
However, the type class morphism laws have constant (known) functions at the head, so I expect they could usefully act as fusion rewrite rules.
Given the implementations (instances) of
Fold, I’d like to verify that the morphism laws for
cfoldl' (above) hold.
The morphism law:
First, give the
Fold argument more structure, so that (without loss of generality) the law becomes
The game is to work backward from this law to the definition of
I’ll do so by massaging the RHS (right-hand side) into the form
cfoldl' (...), where “
...” is the definition
fmap h (F op e k).
This proof show why Max had to add the post-fold function
k to his
k weren’t there, we couldn’t have buried the
h in it.
More usefully, this proof suggests how we could have discovered the
For instance, we might have tried with a simpler and more obvious
data FoldS a b = FS (a -> b -> a) a
Getting into the
fmap derivation, we’d come to
h . foldsl' op e
and then we’d be stuck.
But not really, because the awkward extra bit (
h .) beckons us to generalize by adding Max’s post-fold function.
cfoldl' (pure a) == pure a
Reason as before, starting with the RHS
The imaginative step was inventing structure to match the definition of
This definition is true for any values of
e, so we can use bottom in the definition:
As Twan noticed, the existential (
forall) also lets us pick defined values for
(\_ _ -> ()) and
The derivation of
(<*>) is trickier and is the heart of the problem of fusing folds to reduce multiple traversals to a single one.
Why the heart? Because
(<*>) is all about combining two things into one.
I’m taking a break here.
While fiddling with a proof of the
(<*>) morphism law, I realized a simpler way to structure these folds, which will be the topic of an upcoming post.