Another lovely example of type class morphisms
I read Max Rabkin’s recent post Beautiful folding with great excitement. He shows how to make combine multiple folds over the same list into a single pass, which can then drastically reduce memory requirements of a lazy functional program. Max’s trick is giving folds a data representation and a way to combine representations that corresponds to combining the folds.
Peeking out from behind Max’s definitions is a lovely pattern I’ve been noticing more and more over the last couple of years, namely type class morphisms.
Folds as data
Max gives a data representation of folds and adds on an post-fold step, which makes them composable.
data Fold b c = forall a. F (a -> b -> a) a (a -> c)
The components of a Fold
are a (strict left) fold’s combiner function and initial value, plus a post-fold step.
This interpretation is done by a function cfoldl'
, which turns these data folds into function folds:
cfoldl' :: Fold b c -> [b] -> c
cfoldl' (F op e k) = k . foldl' op e
where foldl'
is the standard strict, left-fold functional:
foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' op a [] = a
foldl' op a (b:bs) =
let a' = a `op` b in a' `seq` foldl' f a' bs
Standard classes
As Twan van Laarhoven pointed out in a comment on Max’s post, Fold b
is a functor and an applicative functor, so some of Max’s Fold
-manipulating functions can be replaced by standard vocabulary.
The Functor
instance is pretty simple:
instance Functor (Fold b) where
fmap h (F op e k) = F op e (h . k)
The Applicative
instance is a bit trickier.
For strictness, Max used used a type of strict pairs:
data Pair c c' = P !c !c'
The instance:
instance Applicative (Fold b) where
pure a = F (error "no op") (error "no e") (const a)
F op e k <*> F op' e' k' = F op'' e'' k''
where
P a a' `op''` b = P (a `op` b) (a' `op'` b)
e'' = P e e'
k'' (P a a') = (k a) (k' a')
Given that Fold b
is an applicative functor, Max’s bothWith
function is then liftA2
.
Max’s multi-cfoldl'
rule then becomes:
forall c f g xs.
h (cfoldl' f xs) (cfoldl' g xs) == cfoldl' (liftA2 h f g) xs
thus replacing two passes with one pass.
Beautiful properties
Now here’s the fun part.
Looking at the Applicative
instance for ((->) a)
, the rule above is equivalent to
forall c f g.
liftA2 h (cfoldl' f) (cfoldl' g) == cfoldl' (liftA2 h f g)
Flipped around, this rule says that liftA2
distributes over cfoldl'
.
Or, “the meaning of liftA2
is liftA2
“.
Neat, huh?
Moreover, this liftA2
property is equivalent to the following:
forall f g.
cfoldl' f <*> cfoldl' g == cfoldl' (f <*> g)
This form is one of the two Applicative
morphism laws (which I usually write in the reverse direction):
For more about these morphisms, see Simplifying semantics with type class morphisms.
That post suggests that semantic functions in particular ought to be type class morphisms (and if not, then you’d have an abstraction leak).
And cfoldl'
is a semantic function, in that it gives meaning to a Fold
.
The other type class morphisms in this case are
cfoldl' (pure a ) == pure a
cfoldl' (fmap h f) == fmap h (cfoldl' f)
Given the Functor
and Applicative
instances of ((->) a)
, these two properties are equivalent to
cfoldl' (pure a ) == const a
cfoldl' (fmap h f) == h . cfoldl' f
or
cfoldl' (pure a ) xs == a
cfoldl' (fmap h f) xs == h (cfoldl' f xs)
Rewrite rules
Max pointed out that GHC does not handle his original multi-cfoldl'
rule.
The reason is that the head of the LHS (left-hand side) is a variable.
However, the type class morphism laws have constant (known) functions at the head, so I expect they could usefully act as fusion rewrite rules.
Inevitable instances
Given the implementations (instances) of Functor
and Applicative
for Fold
, I’d like to verify that the morphism laws for cfoldl'
(above) hold.
Functor
Start with fmap
.
The morphism law:
cfoldl' (fmap h f) == fmap h (cfoldl' f)
First, give the Fold
argument more structure, so that (without loss of generality) the law becomes
cfoldl' (fmap h (F op e k)) == fmap h (cfoldl' (F op e k))
The game is to work backward from this law to the definition of fmap
for Fold
.
I’ll do so by massaging the RHS (right-hand side) into the form cfoldl' (...)
, where “...
” is the definition fmap h (F op e k)
.
fmap h (cfoldl' (F op e k))
== {- inline cfoldl' -}
fmap h (k . foldl' op e)
== {- inline fmap on functions -}
h . (k . foldl' op e)
== {- associativity of (.) -}
(h . k) . foldl' op e
== {- uninline cfoldl' -}
cfoldl' (F op e (h . k))
== {- uninline fmap on Fold -}
cfoldl' (fmap h (F op e k))
This proof show why Max had to add the post-fold function k
to his Fold
type.
If k
weren’t there, we couldn’t have buried the h
in it.
More usefully, this proof suggests how we could have discovered the fmap
definition.
For instance, we might have tried with a simpler and more obvious Fold
representation:
data FoldS a b = FS (a -> b -> a) a
Getting into the fmap
derivation, we’d come to
h . foldsl' op e
and then we’d be stuck.
But not really, because the awkward extra bit (h .
) beckons us to generalize by adding Max’s post-fold function.
Applicative
Next pure:
cfoldl' (pure a) == pure a
Reason as before, starting with the RHS
pure a
== {- inline pure on functions -}
const a
== {- property of const -}
const a . foldl op e
== {- uninline cfoldl' -}
cfoldl' (F (const a) op e)
The imaginative step was inventing structure to match the definition of cfoldl'
.
This definition is true for any values of op
and e
, so we can use bottom in the definition:
instance Applicative (Fold b) where
pure a = F undefined undefined (const a)
As Twan noticed, the existential (forall
) also lets us pick defined values for op
and e
.
He chose (_ _ -> ())
and ()
.
The derivation of (<*>)
is trickier and is the heart of the problem of fusing folds to reduce multiple traversals to a single one.
Why the heart? Because (<*>)
is all about combining two things into one.
Intermission
I’m taking a break here.
While fiddling with a proof of the (<*>)
morphism law, I realized a simpler way to structure these folds, which will be the topic of an upcoming post.
Conal Elliott » Blog Archive » More beautiful fold zipping:
[…] About « Another lovely example of type class morphisms […]
15 November 2008, 1:24 pmConal Elliott » Blog Archive » Enhancing a Zip:
[…] Part One related Max’s data representation of left folds to type class morphisms, a pattern that’s been steering me lately toward natural (inevitable) design of libraries. Part Two simplified that representation to help get to the essence of zipping, and in doing so lost the expressiveness necessary to define Functorand Applicative instaces. Part Three proved the suitability of the zipping definitions in Part Two. […]
15 November 2008, 5:09 pm