## Another lovely example of type class morphisms

I read Max Rabkin’s recent post Beautiful folding with great excitement. He shows how to make combine multiple folds over the same list into a single pass, which can then drastically reduce memory requirements of a lazy functional program. Max’s trick is giving folds a data representation and a way to combine representations that corresponds to combining the folds.

Peeking out from behind Max’s definitions is a lovely pattern I’ve been noticing more and more over the last couple of years, namely type class morphisms.

### Folds as data

Max gives a data representation of folds and adds on an post-fold step, which makes them composable.

``````data Fold b c = forall a. F (a -> b -> a) a (a -> c)
``````

The components of a `Fold` are a (strict left) fold’s combiner function and initial value, plus a post-fold step. This interpretation is done by a function `cfoldl'`, which turns these data folds into function folds:

``````cfoldl' :: Fold b c -> [b] -> c
cfoldl' (F op e k) = k . foldl' op e
``````

where `foldl'` is the standard strict, left-fold functional:

``````foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' op a []     = a
foldl' op a (b:bs) =
let a' = a `op` b in a' `seq` foldl' f a' bs
``````

### Standard classes

As Twan van Laarhoven pointed out in a comment on Max’s post, `Fold b` is a functor and an applicative functor, so some of Max’s `Fold`-manipulating functions can be replaced by standard vocabulary.

The `Functor` instance is pretty simple:

``````instance Functor (Fold b) where
fmap h (F op e k) = F op e (h . k)
``````

The `Applicative` instance is a bit trickier. For strictness, Max used used a type of strict pairs:

``````data Pair c c' = P !c !c'
``````

The instance:

``````instance Applicative (Fold b) where
pure a = F (error "no op") (error "no e") (const a)

F op e k <*> F op' e' k' = F op'' e'' k''
where
P a a' `op''` b = P (a `op` b) (a' `op'` b)
e''             = P e e'
k'' (P a a')    = (k a) (k' a')
``````

Given that `Fold b` is an applicative functor, Max’s `bothWith` function is then `liftA2`. Max’s `multi-cfoldl'` rule then becomes:

``````forall c f g xs.
h (cfoldl' f xs) (cfoldl' g xs) == cfoldl' (liftA2 h f g) xs
``````

thus replacing two passes with one pass.

### Beautiful properties

Now here’s the fun part. Looking at the `Applicative` instance for `((->) a)`, the rule above is equivalent to

``````forall c f g.
liftA2 h (cfoldl' f) (cfoldl' g) == cfoldl' (liftA2 h f g)
``````

Flipped around, this rule says that `liftA2` distributes over `cfoldl'`. Or, “the meaning of `liftA2` is `liftA2`“. Neat, huh?

Moreover, this `liftA2` property is equivalent to the following:

``````forall f g.
cfoldl' f <*> cfoldl' g == cfoldl' (f <*> g)
``````

This form is one of the two `Applicative` morphism laws (which I usually write in the reverse direction):

For more about these morphisms, see Simplifying semantics with type class morphisms. That post suggests that semantic functions in particular ought to be type class morphisms (and if not, then you’d have an abstraction leak). And `cfoldl'` is a semantic function, in that it gives meaning to a `Fold`.

The other type class morphisms in this case are

``````cfoldl' (pure a  ) == pure a

cfoldl' (fmap h f) == fmap h (cfoldl' f)
``````

Given the `Functor` and `Applicative` instances of `((->) a)`, these two properties are equivalent to

``````cfoldl' (pure a  ) == const a

cfoldl' (fmap h f) == h . cfoldl' f
``````

or

``````cfoldl' (pure a  ) xs == a

cfoldl' (fmap h f) xs == h (cfoldl' f xs)
``````

### Rewrite rules

Max pointed out that GHC does not handle his original `multi-cfoldl'` rule. The reason is that the head of the LHS (left-hand side) is a variable. However, the type class morphism laws have constant (known) functions at the head, so I expect they could usefully act as fusion rewrite rules.

### Inevitable instances

Given the implementations (instances) of `Functor` and `Applicative` for `Fold`, I’d like to verify that the morphism laws for `cfoldl'` (above) hold.

#### Functor

Start with `fmap`. The morphism law:

``````cfoldl' (fmap h f) == fmap h (cfoldl' f)
``````

First, give the `Fold` argument more structure, so that (without loss of generality) the law becomes

``````cfoldl' (fmap h (F op e k)) == fmap h (cfoldl' (F op e k))
``````

The game is to work backward from this law to the definition of `fmap` for `Fold`. I’ll do so by massaging the RHS (right-hand side) into the form `cfoldl' (...)`, where “`...`” is the definition `fmap h (F op e k)`.

``````fmap h (cfoldl' (F op e k))

==  {- inline cfoldl' -}

fmap h (k . foldl' op e)

==  {- inline fmap on functions -}

h . (k . foldl' op e)

==  {- associativity of (.) -}

(h . k) . foldl' op e

==  {- uninline cfoldl' -}

cfoldl' (F op e (h . k))

==  {- uninline fmap on Fold  -}

cfoldl' (fmap h (F op e k))
``````

This proof show why Max had to add the post-fold function `k` to his `Fold` type. If `k` weren’t there, we couldn’t have buried the `h` in it.

More usefully, this proof suggests how we could have discovered the `fmap` definition. For instance, we might have tried with a simpler and more obvious `Fold` representation:

``````data FoldS a b = FS (a -> b -> a) a
``````

Getting into the `fmap` derivation, we’d come to

``````h . foldsl' op e
``````

and then we’d be stuck. But not really, because the awkward extra bit (`h .`) beckons us to generalize by adding Max’s post-fold function.

#### Applicative

Next pure:

``````cfoldl' (pure a) == pure a
``````

Reason as before, starting with the RHS

``````pure a

==  {- inline pure on functions -}

const a

==  {- property of const -}

const a . foldl op e

==  {- uninline cfoldl' -}

cfoldl' (F (const a) op e)
``````

The imaginative step was inventing structure to match the definition of `cfoldl'`. This definition is true for any values of `op` and `e`, so we can use bottom in the definition:

``````instance Applicative (Fold b) where
pure a = F undefined undefined (const a)
``````

As Twan noticed, the existential (`forall`) also lets us pick defined values for `op` and `e`. He chose `(_ _ -> ())` and `()`.

The derivation of `(<*>)` is trickier and is the heart of the problem of fusing folds to reduce multiple traversals to a single one. Why the heart? Because `(<*>)` is all about combining two things into one.

#### Intermission

I’m taking a break here. While fiddling with a proof of the `(<*>)` morphism law, I realized a simpler way to structure these folds, which will be the topic of an upcoming post.