## Memoizing polymorphic functions via unmemoization

Last year I wrote two posts on memoizing polymorphic functions. The first post gave a solution of sorts that uses a combination of patricia trees (for integer-keyed maps), stable names, and type equality proofs. The second post showed how to transform some functions that do not quite fit the required form so that they do fit.

Dan Piponi wrote a follow-up post Memoizing Polymorphic Functions with High School Algebra and Quantifiers showing a different approach that was more in the spirit of type-directed functional memoization, as it follows purely from mathematical properties, free of the deeply operational magic of stable names. Recently, I finally studied and worked with Dan’s post enough to understand what he did. It’s very clever and beautiful!

This post re-presents Dan’s elegant insight as I understand it, via some examples that helped it come together for me.

### Playing with quantifiers

Consider the type of `fmap`:

``````fmap :: Functor f ⇒ (a → b) → f a → f b
``````

and make the type quantification explicit

``````fmap :: ∀ f a b. Functor f ⇒ (a → b) → f a → f b
``````

Since the `Functor` constraint depends only on `f`, restrict the scope of the other quantifiers:

``````fmap :: ∀ f. Functor f ⇒ ∀ a b. (a → b) → f a → f b
``````

We cannot further restrict the scope of `a` or `b`, because the first argument of the function involves both `a` and `b`. The second argument, however, ignores `b`, so let’s next `flip` the arguments.

``````flip fmap :: ∀ f. Functor f ⇒ ∀ a b. f a → (a → b) → f b
``````

and restrict the scope of `b`:

``````flip fmap :: ∀ f. Functor f ⇒ ∀ a. f a → ∀ b. (a → b) → f b
``````

Next introduce names:

``````type Yo f a = ∀ b. (a → b) → f b

toYo :: Functor f ⇒ ∀ a. f a → Yo f a
toYo = flip fmap
``````

So `flip fmap` is a way to map `f a` to `Yo f a`. The reverse is easier:

``````fromYo :: ∀ a. Yo f a → f a
fromYo f = f id
``````

GHC’s type-checker won’t really let this `toYo` definition fly, because of persnickety details about universal quantification. Instead, inline `flip` and simplify:

``````toYo x p = fmap p x
``````

### The Yoneda lemma

Are the functions `fromYo` and `toYo` inverses of each other? The Yoneda lemma from category theory says yes. The following proof is taken from Dan’s post:

``````  fromYo (toYo x)
≡ fromYo (λ p → fmap p x)
≡ (λ p → fmap p x) id
≡ fmap id x
≡ x                    -- a functor law

toYo (fromYo f)
≡ toYo (f id)
≡ λ p → fmap p (f id)
≡ λ p → f (fmap p id)  -- f is polymorphic/natural
≡ λ p → f (p ∘ id)     -- fmap on functions is (∘)
≡ λ p → f p
≡ f
``````

Note the critical role of naturality, which holds of parametrically polymorphic functions in Haskell. To say that a function `f` is natural means that it commutes with `fmap p`, for all functions `p`, i.e.,

``````fmap p ∘ f  ≡  f ∘ fmap p
``````

## Playing with the Yoneda lemma

With a little type massaging, the Yoneda lemma can be applied more broadly than its literal form. For instance, consider the type `(∀ b. f b)` for a functor `f.` Does it have a simpler isomorphic form?

The Yoneda lemma applies to types of the form `∀ b. (a → b) → f b` for some type `a`. We don’t have a function to `f b`, but we could fake one by introducing a unit argument:

``````  ∀ b. f b
≅ ∀ b. 1 → f b
``````

Warning: this step is only correct for strict functions, so I’m fudging here.

I’m using “`1`” for the Haskell type `()`. I’ll also use “×” and “+” for product and sum types in these examples, in place of the Haskell names “`(,)`” and “`Either`“. For parsing the examples, × binds more tightly than than +, which binds more tightly than →.

We haven’t gotten to the required form yet, because we have `1` where we need `a → b` for some `a`. Hm. For what type `a` is there only one function from `a` to `b`, where `b` is arbitrary type? By choosing `b` to have at least two elements, we can make different functions of type `a → b` by mapping some element of `a` to different elements of `b`. Unless `a` has no elements at all!

Let `0` be the type with no elements:

``````data 0
``````

This type is usually named “`Void`” in Haskell.

Since there are no values of type `0`, there is exactly one function from `0` to any other type `b`, i.e., `1 ≅ 0 → b`. (This type isomorphism corresponds to a familiar equality on numbers.) Thus continuing our definition from above

``````⋯ ≅ ∀ b. (0 → b) → f b
``````

Now we have the form covered directly by the Yoneda lemma

``````⋯ ≅ f 0
``````

Oh dear, I keep lying to you. The step from `1` to `0 → b` is also valid only for strict functions, or for a `0` without ⊥.

Replaying this derivation,

``````  ∀ b. f b
≅ ∀ b. 1 → f b
≅ ∀ b. (0 → b) → f b
≅ f 0
``````

Next try a simpler example: `∀ a. a`. This time, temporarily introduce the identity functor `Id`:

``````  ∀ a. a
≅ ∀ a. Id a
≅ Id 0
≅ 0
``````

which is correct, as there is no non-⊥ value of type `∀ a. a`.

Next try `∀ a. a → a`:

``````  ∀ a. a → a
≅ ∀ a. (1 → a) → Id a
≅ Id 1
≅ 1
``````

And indeed, there is only one (non-⊥) function of type `∀ a. a → a`.

Here’s a simple specialization of the Yoneda lemma (also mentioned in Dan’s post):

``````  ∀ b. (a → b) → b
≅ ∀ b. (a → b) → Id b
≅ Id a
≅ a
``````

These examples prove the existence of bijection, but we can also synthesize the bijections, as Dan Piponi demonstrates. The bijections then let us memoize.

### Memoizing polymorphic functions via unmemoization

Let’s look at the Yoneda lemma again. Recall

``````type Yo f a = ∀ b. (a → b) → f b
``````

For all natural transformations `f` (in Haskell polymorphic functions “with no funny stuff”), the Yoneda lemma says that

``````Yo f a ≅ f a
``````

where the conversions between `Yo f a` and `f a` are very simple: `(\$ id)` and `flip fmap`.

Now suppose we want to memoize a polymorphic function `∀ a. T a → f a`. The Yoneda lemma suggests that we find some `Q` such that `T a ≅ Q → a`, in which case

``````  ∀ a. T a → f a
≅ ∀ a. (Q → a) → f a
≅ f Q
``````

How do `Q` and `T` relate? `T` is a memo table for `Q`. In other words, `Q` is the type of indices into `T`. In still other words, `Q` is an unmemoization of `T`.

This trick is from Dan Piponi’s post Memoizing Polymorphic Functions with High School Algebra and Quantifiers. We started with a polymorphic function, and we ended up with a data structure, i.e., we have memoized. But the critical step one the route to memoization was unmemoization (of `T`). Delightful!

We saw some very simple examples in the previous section. Let’s next look at some more examples.

#### Pairs

How can we memoize polymorphic functions on pairs, i.e., functions of type `∀ a. a × a → f a`?

We can work out an answer by appealing to the same laws of exponents used in memoization, but now applied in reverse (to create function types instead of eliminate them).

``````  a × a
≅ (1 → a) × (1 → a)
≅ (1 + 1) → a
≅ Bool → a
``````

Again, I’m handling only strict functions here.

Therefore,

``````(∀ a. a × a → f a)  ≅  f Bool
``````

Spelling out this conclusion (once more for practice),

``````  ∀ a. a × a      → f a
≅ ∀ a. (Bool → a) → f a
≅ f Bool
``````

A variation on this derivation is to make the pairing functor explicit as

``````data P a = P a a
``````

or, using functor combinators,

``````type P = Id :*: Id
``````

Now imagine we have a type family `UnTrie` that acts as an inverse to `Trie`

``````type UnTrie P = UnTrie (Id :*: Id)
= UnTrie Id + UnTrie Id
= 1 + 1
≅ Bool
``````

#### Streams

A type of streams (infinite-only lazy lists) is like the type of lists but without a nil case:

``````data Stream a = Cons a (Stream a)
``````

How can we memoize `∀ a. Stream a → f a`? We simply ask: for what type is `Stream` a memo table? Or: what type naturally indexes a stream? An answer is natural numbers in Peano form:

``````data Nat = Zero | Succ Nat
``````

Memoizing `Nat → a` as in Elegant memoization with higher-order types leads to `Stream a`.

Therefore,

``````∀ a. Stream a → f a  ≅  f Nat
``````

In case we didn’t think of `Nat` right off, we can derive it systematically. We’ll appeal to the same three laws of exponents used in memoization, but now applied in reverse (to create function types instead of eliminate them).

Look for a type `N` such that `N → a ≅ Stream a`.

``````  N → a
≅ Stream a
≅ a × Stream a
≅ (1 → a) × (N → a)   --  coinductively
≅ (1 + N) → a
``````

A sufficient condition then is that `N ≅ 1 + N`, which condition then translates to a recursive data type definition:

``````data N = Z | S N
``````

or

``````data Nat = Zero | Succ Nat
``````

#### Infinite binary trees

Next let’s look at one form of binary trees, having values at each node and only infinite paths.

``````data BTree a = BTree a (BTree a) (BTree a)
``````

and let `B` be an unmemoized form of `BTree`, or type of indices of `BTree`, i.e., `B → a ≅ BTree a`.

``````  B → a
≅ BTree a
≅ a × BTree a × BTree a
≅ (1 → a) × (B → a) × (B → a)
≅ (1 + B + B) → a
``````

A sufficient definition for `B` is then `B = 1 + B + B`. Or in legal Haskell form (no recursive type synonyms):

``````data B = Z | E B | O B
``````

We can think of this data type as representing natural numbers in little endian binary. Expanding the type and constructor names:

``````data BinNat = Zero | Even BinNat | Odd BinNat
``````

#### A variation on infinite binary trees

We might have written our tree type differently:

``````data BTree a = BTree a (P (BTree a))
``````

where `P` is the pairing functor. Again, let `B` be the sought index type for (unmemoization of) `BTree`

``````  B → a
≅ BTree a
≅ a × P (BTree a)
≅ (1 → a) × (Bool → BTree a)
≅ (1 → a) × (Bool → B → a)     ­- coinductively
≅ (1 → a) × (Bool × B → a)
≅ 1 + Bool × B → a
``````

which then leads to a slightly different representation for our index type:

``````data BinNat = Zero | NonZero Bool BinNat
``````

We’ve made the bit type explicit. In this form, `BinNat` is a little-endian list of bits, so we can redefine it as such:

``````type BinNat = [Bool]
``````

and the desired isomorphism

``````(∀ a. BTree a → f a)  ≅  f [Bool]
``````

#### Generalized infinite trees

The functor `P` and the type `Bool` have an important relationship to each other in the previous derivation, namely `Bool` is the index type for `P`. We can play this same trick for all index types and their corresponding trie (memoization) functors. Generalize from binary trees:

``````data Tree d a = Tree a (d :→: Tree d a)
``````

where “`k :→: v`” is the type of memo tries over domain type `k`, with range `v`. (The trie structure is driven entirely by `k`.) See other posts on memoization.

These generalized trees are indexed by little-endian natural numbers over a “digit” type `d`. Generalizing the proof for binary trees we get

``````Tree d a ≅ [d] → a
``````

Details:

``````  Tree d a
≅ a × (d :→: Tree d a)
≅ a × (d → Tree d a)
≅ a × (d → [d] → a)   ­- coinductively
≅ a × (d × [d] → a)
≅ (1 → a) × (d × [d] → a)
≅ (1 + d × [d] → a)
≅ [d] → a
``````

So the Yoneda lemma tells us that

``````(∀ a. Tree d a → f a) ≅ f [d]
``````

Specializing, we get binary trees indexed by binary numbers, and streams (“unary trees”) indexed by unary numbers:

``````type BTree = Tree Bool
type Stream = Tree 1
``````

For rose trees we would need an index type `d` whose memo tries are lists, i.e., `d :→: v ≅ [v]` for all types `v`. I don’t think there is such a type `d`, since `[v]` is isomorphic to a `sum` type, and memoization–like exponentiation­-doesn’t produce sums. I’ll return to this question below.

#### `Maybe`

Let’s try something a bit simpler: memoizing `∀ a. Maybe a → f a`. Of what domain type is `Maybe a` a memo table?

Oops. None of the memoizing transformations generates a sum type such as `Maybe`. So we’ll have to try something else. First split the domain sum, yielding a product, and then proceed with the summands.

``````  ∀ a. Maybe a → f a
≅ ∀ a. (1 + a) → f a
≅ ∀ a. (1 → f a) × (a → f a)
≅ ∀ a. f a × (a → f a)
≅ (∀ a. f a) × (∀ a. a → f a)
≅ f 0 × f 1
``````

#### Lists

(Even more than the previous sections, this one is a regurgitation of material from Dan’s post.)

We’ve covered `Maybe` and `Stream` above, so lists might work out with a combination of techniques.

``````  ∀ a. [a] → f a
≅ ∀ a. (1 + a × [a]) → f a
≅ ∀ a. (1 → f a) × (a × [a] → f a)
≅ ∀ a. f a × (a × [a] → f a)
≅ (∀ a. f a) × (∀ a. a × [a] → f a)
≅ f 0 × (∀ a. a × [a] → f a)
``````

How to continue from here? The remaining polymorphic function doesn’t look quite like what we started with. Dan’s trick was to generalize (as Pólya recommended), replacing `[a]` with `a^n × [a]`, starting the process with `n ≡ 0`. Dan generalized even more, where `n` is an arbitrary type, to `(n → a) × [a]`, starting with `n ≡ 0` (zero).

``````type T f n = ∀ a. (n → a) × [a] → f a
``````

Then

``````  T f n
≡ ∀ a. (n → a) × [a] → f a
≅ ∀ a. (n → a) × (1 + a × [a]) → f a
≅ ∀ a. (n → a) × 1 + (n → a) × a × [a] → f a
≅ ∀ a. (n → a) + (n → a) × a × [a] → f a
≅ ∀ a. (n → a) + (n → a) × (1 → a) × [a] → f a
≅ ∀ a. (n → a) + ((n + 1 → a) × [a] → f a
≅ ∀ a. ((n → a) → f a) × ((n + 1 → a) × [a] → f a)
≅ (∀ a. (n → a) → f a) × (∀ a. (n + 1 → a) × [a] → f a)
≅ f n × (∀ a. (n + 1 → a) × [a] → f a)
≡ f n × T f (n + 1)
``````

Inductively, this isomorphism gives rise to an function-free type:

``````type Q f n = f n × Q f (n + 1)
``````

for which

``````Q f n ≅ T f n
``````

Since Haskell doesn’t handle recursive type synonyms, instead define a new `data` type, as in Dan’s post:

``````data Q f n = Q (f n) (Q f (n + 1))
``````

Since `n + 1 ≅ Maybe n`, we could instead define

``````data Q f n = Q (f n) (Q f (Maybe n))
``````

As Dan described, the trick in this derivation is to rework the type `[a]` in polynomial form `[a] ≅ 1 + a + a^2 + a^3 + ⋯`. Then memoization works out fairly easily:

``````(∀ a. [a] → f a) ≅ f 0 × f 1 × f 2 × ...
``````

Where the types `0`, `1`, `2`, … have 0, 1, 2, … elements, respectively. We can choose `2 = Maybe 1`, `3 = Maybe 2`, … or `2 = 1 + 1`, `3 = 2 + 1`, ….

### Where to go from here?

#### Algebraic types

I’m not entirely satisfied with this solution to memoizing polymorphic functions over lists (and other algebraic types). I want to find another angle that doesn’t require this conversion to polynomial form. Partly because I don’t think it handles infinite lists (as Dan mentioned). I suppose one could just add the infinite case:

``````(∀ a. [a] → f a) ≅ (f 0 × f 1 × f 2 × ...) × f Nat ≅ Q f n × f Nat
``````

How to justify/derive this isomorphism? And how to get non-strict memoization, including sharing of work for computations that use bounded input?

I also want a representation that’s friendly to non-strict memoization, sharing work on common prefixes.

#### Constructing isomorphisms

As always, type isomorphism is at the core of functional memoization. However, it’s not enough to show that isomorphisms exist. We have to construct the isomorphisms in order to implement a harder memoization problem via an easier one.

Dan’s post builds up some isomorphisms. The process is rather tedious, and I’d like to make it much easier. I’ve tinkered with making this process much more elegant by applying Semantic editor combinators in their generalized form (as in the paper Tangible Functional Programming), using bijections as the deep arrow. So far, however, I’m struggling unsuccessfully against awkwardness in the handling of explicit polymorphism (higher-rank types).

#### Automation

In my other posts on memoization, functions are memoized in a fully automatic, type-driven way. Can the same be done somehow with polymorphic functions?

#### Non-strictness

I fudged treatment of non-strictness above. Can non-strict polymorphic functions be handled correctly and elegantly, perhaps with a simple combination of previous techniques or perhaps with new ones?

#### Other memoization tools?

Dan’s use of the Yoneda lemma to memoize polymorphic functions (via unmemoization) came as a surprise to me. Are there other deep properties of typed, pure functional programming that give rise to additional tools for memoization? One place to start looking is additional free theorems.

#### Relationship to numeric representations

In Purely Functional Data Structures, Chris Okasaki points out a correspondence between number representation schemes and collection data structures. How does this correspondence relate to memoization?

1. #### Dan P:

Nice post. I like your nicely laid out proofs.

When I wrote my blog article, I thought infinite lists couldn’t possibly be memoized. My argument was that in order to know you were looking at the right entry in the memo table you need equality on infinite lists. But equality on infinite lists isn’t computable.

However, that kind of informal reasoning tells you that this is impossible: http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/

So now I’m not sure if infinite lists aren’t memoizable. From a topological perspective, the set of infinite lists are compact (in the appropriate topology) which means that they share many useful properties with finite sets. Memoizability may be one.

Seems like an nice problem to consider.

2. #### Peter H:

Dan, the set of infinite lists (Stream a) is compact (if and) only if a is finite. Cantor space 2^w is compact. Baire space w^w isn’t. I don’t think Escardo’s algorithm can extended to work on Baire space.

3. #### conal:

Dan Piponi wrote:

When I wrote my blog article, I thought infinite lists couldn’t possibly be memoized. My argument was that in order to know you were looking at the right entry in the memo table you need equality on infinite lists.

Worse yet, (as you already know) in type-driven functional memoization, we don’t need to compare domain values for equality. Rather, we decode those values as paths into a type-based memo structure. For infinitely large domain values, we must then generate and follow infinite paths.

However, I think my posts on Non-strict memoization solve this problem, and can make memoization more efficient even with finite lists. Don’t look (possibly forever) for “the right entry”. Instead, given a (possibly infinite) domain value, combine the information (using `lub`) associated with a chain of approximations leading up to the domain value in the limit.

For instance, consider memoizing the function that adds the first 20 elements of a list: `f = sum . take 20`. Using non-strict memoization, the work for all lists with the same first 20 elements can be shared, even for infinite lists.

4. #### Dan P:

For infinitely large domain values, we must then generate and follow infinite paths.

Actually, that’s not necessarily the case. A total function p of type Stream Bool -> Bool can only examine a finite number of elements of any stream argument it is given. Konig’s lemma tells us that for any p we can find a single finite number that bounds the number of stream elements that are ever examined, no matter what stream argument is given. This means we can use a finite trie for memoisation without doing lub tricks.

I’d post the code but don’t know what markup to use.

I’ll be checking out your lub posts in more detail. They answer some questions that have been nagging me for a while.

5. #### conal:

Hi Dan. Try surrounding your code by a pair of html `pre` tags. The preview button ought to show you how it comes out.

I don’t know what prevents the usual Markdown formatting from working in comments. More strangely yet, they do work for me but not for other folks.

6. #### Dan P:

I had written a lazier version that folded memo, berger and isConst into one function but it’s buggy so I reverted to my first iteration. I expect it’s fixable. So don’t see this as a good version. Just a proof of what’s possible.

```type Nat = Integer

(x # f) i = if i==0 then x else f (i-1)

-- Borrowed from http://www.cs.bham.ac.uk/~mhe/papers/exhaustive.hs
berger :: ((Nat -> Bool) -> Bool) -> (Nat -> Bool)
berger p = let b0 = berger (f -> p (False # f))
b1 = berger (f -> p (True # f))
in if p (False # b0) then False # b0 else True # b1

-- As if by magic, isConst tests whether its argument is the
-- constant function.
isConst :: ((Nat -> Bool) -> Bool) -> Bool
isConst p = let q f = p f /= pf
pf = p (const False)
in not (q (berger q))

data Tree = Tree Tree Tree | Leaf Bool deriving Show

unmemo :: Tree -> (Nat -> Bool) -> Bool
unmemo (Leaf x) a = x
unmemo (Tree x y) a = unmemo (if a 0 then y else x) (a . (1+))

memo :: ((Nat -> Bool) -> Bool) -> Tree
memo p = if isConst p
then Leaf (p (const False))
else Tree (memo (p . (False #))) (memo (p . (True #)))

-- id' converts a predicate on Boolean streams to a memoised version

id' :: ((Nat -> Bool) -> Bool) -> ((Nat -> Bool) -> Bool)
id' = unmemo . memo
```
7. #### Conal Elliott » Blog Archive » Fixing lists:

[…] Memoizing polymorphic functions via unmemoization […]

8. #### Conal Elliott » Blog Archive » Type-bounded numbers:

[…] Memoizing polymorphic functions via unmemoization […]

9. #### Conal Elliott » Blog Archive » Reverse-engineering length-typed vectors:

[…] Memoizing polymorphic functions via unmemoization […]

10. #### Conal Elliott » Blog Archive » From tries to trees:

[…] Memoizing polymorphic functions via unmemoization […]

11. #### Edward Kmett:

BTW- What you call an “unmemoization” is more often consisered a representation of a representable functor by the categorically inclined. Representable functors are isomorphic to (a -> x) for some choice of a. Every right adjoint is representable by its left adjoint applied to unit for instance. You may want to have a look at my representable-functors, adjunctions and representable-tries packages on hackage.