## Differentiation of higher-order types

A “one-hole context” is a data structure with one piece missing. Conor McBride pointed out that the derivative of a regular type is its type of one-hole contexts. When a data structure is assembled out of common functor combinators, a corresponding type of one-hole contexts can be derived mechanically by rules that mirror the standard derivative rules learned in beginning differential calculus.

I’ve been playing with functor combinators lately. I was delighted to find that the data-structure derivatives can be expressed directly using the standard functor combinators and type families.

The code in this blog post is taken from the Haskell library functor-combo.

See also the Haskell Wikibooks page on zippers, especially the section called “Differentiation of data types”.

I mean this post not as new research, but rather as a tidy, concrete presentation of some of Conor’s delightful insight.

### Functor combinators

Let’s use the same set of functor combinators as in *Elegant memoization with higher-order types* and *Memoizing higher-order functions*:

```
data Void a -- no constructors
type Unit a = Const () a
data Const x a = Const x
newtype Id a = Id a
data (f :+: g) a = InL (f a) | InR (g a)
data (f :*: g) a = f a :*: g a
newtype (g :. f) a = O (g (f a))
```

### Derivatives

The derivative of a functor is another functor. Since the shape of the derivative is non-uniform (depends on the shape of the functor being differentiated) define a higher-order type family:

```
type family Der (f :: (* → *)) :: (* → *)
```

The usual derivative rules can then be translated without applying much imagination.
That is, if we start with derivative rules in their *functional* form (e.g., as in the paper *Beautiful differentiation*, Section 2 and Figure 1).

For instance, the derivative of the constant function is the constant 0 function, and the derivative of the identity function is the constant 1 function.
If `der`

is the derivative functional mapping functions (of real numbers) to functions,

```
der (const x) ≡ 0
der id ≡ 1
```

On the right-hand sides, I am exploiting the function instances of `Num`

from the library applicative-numbers.
To be more explicit, I could have written “`const 0`

” and “`const 1`

“.

Correspondingly,

```
type instance Der (Const x) = Void -- 0
type instance Der Id = Unit -- 1
```

Note that the types `Void a`

and `Unit a`

have 0 and 1 element, respectively, if we ignore ⊥.
Moreover, `Void`

is a sort of additive identity, and `Unit`

is a sort of multiplicative identity, again ignoring ⊥.
For these reasons, `Void`

and `Unit`

might be more aptly named “`Zero`

” and “`One`

“.

The first rule says that the a value of type `Const x a`

has no one-hole context (for type `a`

), which is true, since there is an `x`

but no `a`

.
The second rule says that there is exactly one possible context for `Id a`

, since the one and only `a`

value must be removed, and no information remains.

A (one-hole) context for a sum is a context for the left or the right possibility of the sum:

```
type instance Der (f :+: g) = Der f :+: Der g
```

Correspondingly, the derivative of a sum of functions is the sum of the functions’ derivatives::

```
der (f + g) ≡ der f + der g
```

Again I’m using the function `Num`

instance from applicative-numbers.

For a pair, the one hole of a context can be made somewhere in the first component or somewhere in the second component. So the pair context consists of a holey first component and a full second component or a full first component and a holey second component.

```
type instance Der (f :*: g) = Der f :*: g :+: f :*: Der g
```

Similarly, for functions:

```
der (f * g) ≡ der f * g + f * der g
```

Finally, consider functor composition.
If `g`

and `f`

are container types, then `(g :. f) a`

is the type of `g`

containers of `f`

containers of `a`

elements.
The `a`

-shaped hole must come from one of the contained `f a`

structures.

```
type instance Der (g :. f) = (Der g :. f) :*: Der f
```

Here’s one way to think of this derivative functor:
to make an `a`

-shaped hole in a `g (f a)`

, first remove an `f a`

structure, leaving an `(f a)`

-shaped hole, and then put back all but an `a`

value extracted from the removed `f a`

struture.
So the overall (one-hole) context can be assembled from two parts: a `g`

context of `f a`

structures, and an `f`

context of `a`

values.

The corresponding rule for function derivatives:

```
der (g ∘ f) ≡ (der g ∘ f) * der f
```

which again uses `Num`

on functions.
Written out more explicitly:

```
der (g ∘ f) a ≡ der g (f a) * der f a
```

which may look more like the form you’re used to.

### Summary of derivatives

To emphasize the correspondence between forms of differentiation, here are rules for *function* and *functor* derivatives:

```
der (const x) ≡ 0
Der (Const x) ≡ Void
der id ≡ 1
Der Id ≡ Unit
der (f + g) ≡ der f + der g
Der (f :+: g) ≡ Der f :+: Der g
der (f * g) ≡ der f * g + f * der g
Der (f :*: g) ≡ Der f :*: g :+: f :*: Der g
der (g ∘ f) ≡ (der g ∘ f) * der f
Der (g :. f) ≡ (Der g :. f) :*: Der f
```

### Filling holes

Each derivative functor is a one-hole container. One useful operation on derivatives is filling that hole.

```
fillC :: Functor f ⇒ Der f a → a → f a
```

The specifics of how to fill in a hole will depend on the choice of functor `f`

, so let’s make the `fillC`

operation a method of a new type class.
This new class is also a handy place to stash the associated type of derivatives, as an alternative to the top-level declarations above.

```
class Functor f ⇒ Holey f where
type Der f :: * → *
fillC :: Der f a → a → f a
```

I’ll add one more method to this class in an upcoming post.

For `Const x`

, there are no cases to handle, since there are no holes.

```
instance Holey (Const x) where
type Der (Const x) = Void
fillC = error "fillC for Const x: no Der values"
```

I added a definition just to keep the compiler from complaining.
This particular `fillC`

can only be applied to a value of type `Void a`

, and there are no such values other than ⊥.

Is there a more elegant way to define functions over data types with no constructors? One idea is to provide a single, polymorphic function over void types:

```
voidF :: Void a → b
voidF = error "voidF: no value of type Void"
```

And use whenever as needed, e.g.,

```
fillC = voidF
```

Next is our identity functor:

```
instance Holey Id where
type Der Id = Unit
fillC (Const ()) a = Id a
```

More succinctly,

```
fillC (Const ()) = Id
```

For sums,

```
instance (Holey f, Holey g) ⇒ Holey (f :+: g) where
type Der (f :+: g) = Der f :+: Der g
fillC (InL df) a = InL (fillC df a)
fillC (InR df) a = InR (fillC df a)
```

or

```
fillC (InL df) = InL ∘ fillC df
fillC (InR df) = InR ∘ fillC df
```

Products also have two cases, since the derivative of a product is a sum:

```
instance (Holey f, Holey g) ⇒ Holey (f :*: g) where
type Der (f :*: g) = Der f :*: g :+: f :*: Der g
fillC (InL (dfa :*: ga)) a = fillC dfa a :*: ga
fillC (InR ( fa :*: dga)) a = fa :*: fillC dga a
```

Less pointfully,

```
fillC (InL (dfa :*: ga)) = (:*: ga) ∘ fillC dfa
fillC (InR ( fa :*: dga)) = (fa :*:) ∘ fillC dga
```

Finally, functor composition:

```
instance (Holey f, Holey g) ⇒ Holey (g :. f) where
type Der (g :. f) = (Der g :. f) :*: Der f
fillC (O dgfa :*: dfa) a = O (fillC dgfa (fillC dfa a))
```

The less pointful form is more telling.

```
fillC (O dgfa :*: dfa) = O ∘ fillC dgfa ∘ fillC dfa
```

In words: filling of the derivative of a composition is a composition of filling of the derivatives.

### Thoughts on composition

Let’s return to the derivative rules for composition, i.e., the chain rule, on functions and on functors:

```
der (g ∘ f) ≡ (der g ∘ f) * der f
Der (g :. f) ≡ (Der g :. f) :*: Der f
```

Written in this way, the functor rule looks quite compelling.
Something bothers me, however.
For functions, multiplication is a special case, not the general case, and is only meaningful and correct when differentiating functions from scalars to scalars.
In general, derivative values are *linear maps*, and the chain rule uses composition on linear maps rather than multiplication on scalars (that *represent* linear maps).
I’ve written several posts on derivatives and a paper *Beautiful differentiation*, describing this perspective, which comes from calculus on manifolds.

Look again at the less pointful formulation of `fillC`

for derivatives of compositions:

```
fillC (O dgfa :*: dfa) = O ∘ fillC dgfa ∘ fillC dfa
```

The product in this case is just structural.
The actual use in `fillC`

is indeed a composition of linear maps.
In this context, “linear” has a different meaning from before.
It’s another way of saying “fills a *one-hole* context” (as the linear patterns of term rewriting and of ML & Haskell).

So maybe there’s a more general/abstract view of *functor* derivatives, just as there is a more general/abstract view of *function* derivatives.
In that view, we might replace the functor chain rule’s product with a notion of composition.

## Paul Brauner:

Hi, nice post! I’ve tried to compile the last version of functor-combo (0.0.3) but one file seems to be missing:

cabal: can’t find source for FunctorCombo/StrictMemo in src

28 July 2010, 9:32 pm## Conal Elliott » Blog Archive » Another angle on zippers:

[…] About « Differentiation of higher-order types […]

29 July 2010, 9:15 am## conal:

Thanks, Paul. Fixed now. Please try again.

29 July 2010, 9:45 am## monoidal:

I like the analogy between value-level and type-level differentation. In Omega programming language, level polymorphism is supported, and corresponding values and types can be defined simultaneously. (The definition gives also automatically kinds, sorts etc.). Unfortunately I am unable to define value functions and type functions with one definition, so I have to give analogous definitions. http://hpaste.org/40138/level_polymorphism_in_omega

26 September 2010, 5:21 am