## Memoizing polymorphic functions – part one

Memoization takes a function and gives back a semantically equivalent function that reuses rather than recomputes when applied to the same argument more than once. Variations include not-quite-equivalence due to added strictness, and replacing value equality with pointer equality.

Memoization is often packaged up polymorphically:

``````memo :: (???) => (k -> v) -> (k -> v)
``````

For pointer-based (“lazy”) memoization, the type constraint (“???”) is empty. For equality-based memoization, we’d need at least `Eq k`, and probably `Ord k` or `HasTrie k` for efficient lookup (in a finite map or a possibly infinite memo trie).

Although `memo` is polymorphic, its argument is a monomorphic function. Implementations that use maps or tries exploit that monomorphism in that they use a type like `Map k v` or `Trie k v`. Each map or trie is built around a particular (monomorphic) type of keys. That is, a single map or trie does not mix keys of different types.

Now I find myself wanting to memoize polymorphic functions, and I don’t know how to do it.

### Flavors of polymorphism

If a recursively defined function `f` is polymorphic, then the recursion may still be monomorphic. That is, recursive calls may be restricted to the same type instance as the parent call. Most recursive polymorphic functions fit this form, because most polymorphic recursive data types are “regular”, meaning that a polymorphic data type is included in itself only at the same type instance. For instance, the usual polymorphic lists and trees are regular.

Among other places, non-regular, or “nested data types” arise in statically typed encodings of typed languages. For instance, here’s a GADT (generalized algebraic data type) for typed lambda calculus expressions:

``````-- Variables
data V a = V String (Type a)

-- Expressions
data E :: * -> * where
Lit  :: a -> E a                      -- literal
Var  :: V  a -> E a                   -- variable
(:^) :: E (a -> b) -> E a -> E b      -- application
Lam  :: V a -> E b -> E (a -> b)      -- abstraction
``````

The `Type` type is sort of like `TypeRep`, except that it is statically typed.

``````data Type :: * -> * where
Bool   :: Type Bool
Float  :: Type Float
...
(:*:)  :: Type a -> Type b -> Type (a,b)
(:->:) :: Type a -> Type b -> Type (a->b)
``````

These GADTs (`E` and `Type`) are both non-regular, and so recursive functions over them will involve more than one argument type.

So how can we memoize?

### A first try

Let’s consider a specific case of polymorphic functions:

``````type k :--> v = ∀ a. HasType a => k a -> v a
``````

`HasType` is to `Typeable` as `Type` is to `TypeRep`. The `memo` implementation I’m playing with relies on `HasType`.

The memoizer can have type

``````memo :: (k :--> v) -> (k :--> v)
``````

which uses rank 2 polymorphism (because of the argument type’s `∀`, which cannot be moved to the outside).

My implementation of `memo` is similar to the discussion in Stretching the storage manager: weak pointers and stable names in Haskell, but modernized a bit and adapted for polymorphism. It uses an `IntMap` of lists of pairs of stable names (akin to pointers) for arguments and values for results. The idea is to first use `hashStableName` to get an `Int` to use as an `IntMap` key, and then linearly traverse the resulting list of binding pairs, comparing stable keys for equality. (`StableName` has `Eq` but not `Ord`.) Although `hashStableName` can map different stable names to the same hash value, collisions are rare, so the `StableBind` lists rarely have more than one element and the linear search is cheap.

``````type SM k v = I.IntMap [StableBind k v]  -- Stable map

data StableBind k v = ∀ a. HasType a => SB (StableName (k a)) (v a)
``````

The reason for hiding the type parameter `a` in `StableName` is so that different bindings can be for different types.

The key tricky bit is managing static typing while searching for a particular `StableName a` in a `[StableBind]`. Here’s my implementation:

``````blookup :: ∀ k v a. HasType a =>
StableName (k a) -> [StableBind k v] -> Maybe (v a)
blookup stk = look
where
look :: [StableBind k v] -> Maybe (v a)
look [] = Nothing
look (SB stk' v : binds')
| Just Refl <- tya `tyEq` typeOf2 stk', stk == stk' = Just v
| otherwise                                         = look binds'
tya :: Type a
tya = typeT
``````

The crucial magic bit is

``````tyEq :: Type a -> Type b -> Maybe (a :=: b)
``````

where `a :=: b` represents a proof that the types `a` and `b` are the same type. The proof type has a simple GADT representation:

``````data (:=:) :: * -> * -> * where Refl :: a :=: a
``````

This simple type definition ensures that only valid type-equality proofs can exist. Well, except for ⊥. The guard’s pattern match with `Just Refl` will force evaluation, so that ⊥ can’t sneak by us. That match also informs the type-checker that `stk` and `stk'` are stable names of the same type in that clause, which then makes `stk == stk'` be well-typed, and makes `Just v` have the required type, i.e., `Maybe (v a)`.

Finally, the `typeOf2` function is a simple helper that peels off two type constructors and extracts a `Type`:

``````typeOf2 :: HasType a => g (f a) -> Type a
``````

### Wishing for more

The type of `memo` above is too restrictive for my uses. It only handles polymorphic functions of type `k a -> v a`, and only with the single constraint `HasType a`.

The reason I want lazy memoization now is that I’m compressing expressions to maximize representation sharing, as John Hughes described in Lazy Memo-functions. Once sharing is maximized, pointer-based memoization works better, because equal values are pointer-equal. To compress an expression, simply use a memoized copy function, as John suggested.

``````compress :: HasType a => E a -> E a
compress e = mcopy e
where
mcopy, copy :: HasType b => E b -> E b
-- Memo version
mcopy = memo copy
-- Copier, with memo-copied components
copy (u :^ v)  = appM (mcopy u) (mcopy v)
copy (Lam v b) = lamM v (mcopy b)
copy e         = e
-- memoized constructors
appM :: (HasType a, HasType b) => E (a -> b) -> E a -> E b
appM = memo2 (:^)
lamM :: (HasType a, HasType b) => V a -> E b -> E (a -> b)
lamM = memo2 Lam
``````

The `memo2` function is defined in terms of `memo`, using some some `newtype` trickery. Its type:

``````memo2 :: HasType a =>
(k a -> l a -> v a) -> (k a -> l a -> v a)
``````

But, sigh, the `(:^)` and `Lam` constructors I’m trying to memoize do not have the required types. My higher-order-polymorphic memo functions do not have flexible enough types.

And this is where I’m stuck.

I’d appreciate your ideas and suggestions.