Memoizing polymorphic functions – part one
Memoization takes a function and gives back a semantically equivalent function that reuses rather than recomputes when applied to the same argument more than once. Variations include not-quite-equivalence due to added strictness, and replacing value equality with pointer equality.
Memoization is often packaged up polymorphically:
memo :: (???) => (k -> v) -> (k -> v)
For pointer-based (“lazy”) memoization, the type constraint (“???”) is empty.
For equality-based memoization, we’d need at least Eq k
, and probably Ord k
or HasTrie k
for efficient lookup (in a finite map or a possibly infinite memo trie).
Although memo
is polymorphic, its argument is a monomorphic function.
Implementations that use maps or tries exploit that monomorphism in that they use a type like Map k v
or Trie k v
.
Each map or trie is built around a particular (monomorphic) type of keys.
That is, a single map or trie does not mix keys of different types.
Now I find myself wanting to memoize polymorphic functions, and I don’t know how to do it.
Flavors of polymorphism
If a recursively defined function f
is polymorphic, then the recursion may still be monomorphic.
That is, recursive calls may be restricted to the same type instance as the parent call.
Most recursive polymorphic functions fit this form, because most polymorphic recursive data types are “regular”, meaning that a polymorphic data type is included in itself only at the same type instance.
For instance, the usual polymorphic lists and trees are regular.
Example: GADTs
Among other places, non-regular, or “nested data types” arise in statically typed encodings of typed languages. For instance, here’s a GADT (generalized algebraic data type) for typed lambda calculus expressions:
-- Variables
data V a = V String (Type a)
-- Expressions
data E :: * -> * where
Lit :: a -> E a -- literal
Var :: V a -> E a -- variable
(:^) :: E (a -> b) -> E a -> E b -- application
Lam :: V a -> E b -> E (a -> b) -- abstraction
The Type
type is sort of like TypeRep
, except that it is statically typed.
data Type :: * -> * where
Bool :: Type Bool
Float :: Type Float
...
(:*:) :: Type a -> Type b -> Type (a,b)
(:->:) :: Type a -> Type b -> Type (a->b)
These GADTs (E
and Type
) are both non-regular, and so recursive functions over them will involve more than one argument type.
So how can we memoize?
A first try
Let’s consider a specific case of polymorphic functions:
type k :--> v = ∀ a. HasType a => k a -> v a
HasType
is to Typeable
as Type
is to TypeRep
.
The memo
implementation I’m playing with relies on HasType
.
The memoizer can have type
memo :: (k :--> v) -> (k :--> v)
which uses rank 2 polymorphism (because of the argument type’s ∀
, which cannot be moved to the outside).
My implementation of memo
is similar to the discussion in Stretching the storage manager: weak pointers and stable names in Haskell, but modernized a bit and adapted for polymorphism.
It uses an IntMap
of lists of pairs of stable names (akin to pointers) for arguments and values for results.
The idea is to first use hashStableName
to get an Int
to use as an IntMap
key, and then linearly traverse the resulting list of binding pairs, comparing stable keys for equality.
(StableName
has Eq
but not Ord
.)
Although hashStableName
can map different stable names to the same hash value, collisions are rare, so the StableBind
lists rarely have more than one element and the linear search is cheap.
type SM k v = I.IntMap [StableBind k v] -- Stable map
data StableBind k v = ∀ a. HasType a => SB (StableName (k a)) (v a)
The reason for hiding the type parameter a
in StableName
is so that different bindings can be for different types.
The key tricky bit is managing static typing while searching for a particular StableName a
in a [StableBind]
.
Here’s my implementation:
blookup :: ∀ k v a. HasType a =>
StableName (k a) -> [StableBind k v] -> Maybe (v a)
blookup stk = look
where
look :: [StableBind k v] -> Maybe (v a)
look [] = Nothing
look (SB stk' v : binds')
| Just Refl <- tya `tyEq` typeOf2 stk', stk == stk' = Just v
| otherwise = look binds'
tya :: Type a
tya = typeT
The crucial magic bit is
tyEq :: Type a -> Type b -> Maybe (a :=: b)
where a :=: b
represents a proof that the types a
and b
are the same type.
The proof type has a simple GADT representation:
data (:=:) :: * -> * -> * where Refl :: a :=: a
This simple type definition ensures that only valid type-equality proofs can exist.
Well, except for ⊥.
The guard’s pattern match with Just Refl
will force evaluation, so that ⊥ can’t sneak by us.
That match also informs the type-checker that stk
and stk'
are stable names of the same type in that clause, which then makes stk == stk'
be well-typed, and makes Just v
have the required type, i.e., Maybe (v a)
.
Finally, the typeOf2
function is a simple helper that peels off two type constructors and extracts a Type
:
typeOf2 :: HasType a => g (f a) -> Type a
Wishing for more
The type of memo
above is too restrictive for my uses.
It only handles polymorphic functions of type k a -> v a
, and only with the single constraint HasType a
.
The reason I want lazy memoization now is that I’m compressing expressions to maximize representation sharing, as John Hughes described in Lazy Memo-functions. Once sharing is maximized, pointer-based memoization works better, because equal values are pointer-equal. To compress an expression, simply use a memoized copy function, as John suggested.
compress :: HasType a => E a -> E a
compress e = mcopy e
where
mcopy, copy :: HasType b => E b -> E b
-- Memo version
mcopy = memo copy
-- Copier, with memo-copied components
copy (u :^ v) = appM (mcopy u) (mcopy v)
copy (Lam v b) = lamM v (mcopy b)
copy e = e
-- memoized constructors
appM :: (HasType a, HasType b) => E (a -> b) -> E a -> E b
appM = memo2 (:^)
lamM :: (HasType a, HasType b) => V a -> E b -> E (a -> b)
lamM = memo2 Lam
The memo2
function is defined in terms of memo
, using some some newtype
trickery.
Its type:
memo2 :: HasType a =>
(k a -> l a -> v a) -> (k a -> l a -> v a)
But, sigh, the (:^)
and Lam
constructors I’m trying to memoize do not have the required types.
My higher-order-polymorphic memo functions do not have flexible enough types.
And this is where I’m stuck.
I’d appreciate your ideas and suggestions.
Conal Elliott » Blog Archive » Memoizing polymorphic functions - part two:
[…] About « Memoizing polymorphic functions – part one […]
12 June 2009, 2:04 pmConal Elliott » Blog Archive » Memoizing polymorphic functions via unmemoization:
[…] year I wrote two posts on memoizing polymorphic functions. The first post gave a solution of sorts that uses a combination of patricia trees (for integer-keyed maps), stable […]
10 January 2011, 4:00 pm