Memoizing higher-order functions

Memoization incrementally converts functions into data structures. It pays off when a function is repeatedly applied to the same arguments and applying the function is more expensive than accessing the corresponding data structure.

In lazy functional memoization, the conversion from function to data structure happens all at once from a denotational perspective, and incrementally from an operational perspective. See Elegant memoization with functional memo tries and Elegant memoization with higher-order types.

As Ralf Hinze presented in Generalizing Generalized Tries, trie-based memoization follows from three simple isomorphisms involving function types:

1 → a ≅ a

(a + b) → c ≅ (a → c) × (b → c)

(a × b) → c ≅ a → (b → c)

which correspond to the familiar laws of exponents

a^1 = a

c^(a + b) = c^a × c^b

c^(a × b) = (c^b)^a

When applied as a transformation from left to right, each law simplifies the domain part of a function type. Repeated application of the rules then eliminates all function types or reduces them to functions over atomic types. These atomic domains are eliminated as well by additional mappings, such as between a natural number and a list of bits (as in Patricia trees). Algebraic data types correspond to sums of products and so are eliminated by the sum and product rules. Recursive algebraic data types (lists, trees, etc.) give rise to correspondingly recursive trie types.
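
As a minimal, hand-rolled illustration of these laws (separate from the trie machinery introduced below): since Bool ≅ () + (), the unit and sum rules say that a memo structure for Bool → a is just a pair of values, matching c^(1 + 1) = c × c. A sketch:

-- Hand-rolled Bool trie: one slot per argument, per the sum and unit rules.
data BoolTrie a = BoolTrie a a            -- value at False, value at True

trieBool ∷ (Bool → a) → BoolTrie a
trieBool f = BoolTrie (f False) (f True)

untrieBool ∷ BoolTrie a → (Bool → a)
untrieBool (BoolTrie e _) False = e
untrieBool (BoolTrie _ t) True  = t

-- Memoization is the round trip: build the table once, then only look up.
memoBool ∷ (Bool → a) → (Bool → a)
memoBool = untrieBool ∘ trieBool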

So, with a few simple and familiar rules, we can memoize functions over an infinite variety of common types. Have we missed any?

Yes. What about functions over functions?

Edits:

  • 2010-07-22: Made the memoization example polymorphic and switched from pairs to lists. The old example accidentally coincided with a specialized version of trie itself.
  • 2011-02-27: updated some notation

Tries

In Elegant memoization with higher-order types, I showed a formulation of functional memoization using functor combinators.

type k ↛ v = Trie k v

class HasTrie k where
  type Trie k ∷ * → *
  trie   ∷ (k → v) → (k ↛ v)
  untrie ∷ (k ↛ v) → (k → v)

I will describe higher-order memoization in terms of this formulation. I imagine it would also work out, though less elegantly, in the associated data types formulation described in Elegant memoization with functional memo tries.
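
For reference, the memo function used in the demo below is presumably just the round trip through a trie, as in Data.MemoTrie:

memo ∷ HasTrie k ⇒ (k → v) → (k → v)
memo = untrie ∘ trie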

Domain isomorphisms

Elegant memoization with higher-order types showed how to define a HasTrie instance in terms of the instance of an isomorphic type, e.g., reducing tuples to nested pairs or booleans to a sum of unit types. A C macro, HasTrieIsomorph, encapsulates the domain isomorphism technique. For instance, to reduce triples to pairs:

HasTrieIsomorph( (HasTrie a, HasTrie b, HasTrie c), (a,b,c), ((a,b),c)
, λ (a,b,c) → ((a,b),c), λ ((a,b),c) → (a,b,c))
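
Presumably the macro expands to an instance along the following lines (a sketch; details of the actual macro may differ):

instance (HasTrie a, HasTrie b, HasTrie c) ⇒ HasTrie (a,b,c) where
  type Trie (a,b,c) = Trie ((a,b),c)
  trie f   = trie (λ ((a,b),c) → f (a,b,c))
  untrie t = λ (a,b,c) → untrie t ((a,b),c)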

This isomorphism technique applies as well to the standard functor combinators used for constructing tries (and many other purposes). Those combinators again:

data Const x a = Const x
data Id a = Id a
data (f × g) a = f a × g a
data (f + g) a = InL (f a) | InR (g a)
newtype (g ∘ f) a = O (g (f a))

and their trie definitions:

HasTrieIsomorph( HasTrie a, Const a x, a, getConst, Const )
HasTrieIsomorph( HasTrie a, Id a, a, unId, Id )
HasTrieIsomorph( (HasTrie (f a), HasTrie (g a))
, (f × g) a, (f a,g a)
, λ (fa × ga) → (fa,ga), λ (fa,ga) → (fa × ga) )
HasTrieIsomorph( (HasTrie (f a), HasTrie (g a))
, (f + g) a, Either (f a) (g a)
, eitherF Left Right, either InL InR )
HasTrieIsomorph( HasTrie (g (f a))
, (g ∘ f) a, g (f a) , unO, O )

The eitherF function is a variation on either:

eitherF ∷ (f a → b) → (g a → b) → (f + g) a → b
eitherF p _ (InL fa) = p fa
eitherF _ q (InR ga) = q ga
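
For example, the sum case above presumably expands to an instance like this one, using eitherF and either InL InR as the two directions of the isomorphism (again a sketch):

instance (HasTrie (f a), HasTrie (g a)) ⇒ HasTrie ((f + g) a) where
  type Trie ((f + g) a) = Trie (Either (f a) (g a))
  trie h   = trie (h ∘ either InL InR)
  untrie t = untrie t ∘ eitherF Left Right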

Higher-order memoization

Now higher-order memoization is easy. Apply yet another isomorphism, this time between functions and tries: The trie and untrie methods are exactly the mappings we need.

HasTrieIsomorph( (HasTrie a, HasTrie (a ↛ b))
, a → b, a ↛ b, trie, untrie)

So, to memoize a higher-order function f ∷ (a → b) → v, we only need a trie type for a and one for a ↛ b. The latter (tries whose domains are themselves tries) are provided by the isomorphisms above, together with additional ones.
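
Spelled out as an instance, this last isomorphism would presumably read (a sketch):

instance (HasTrie a, HasTrie (a ↛ b)) ⇒ HasTrie (a → b) where
  type Trie (a → b) = Trie (a ↛ b)
  trie h   = trie (h ∘ untrie)    -- convert each trie key back to a function
  untrie t = untrie t ∘ trie      -- convert each function key to a trie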

Demo

Our sample higher-order function will take a function of booleans and yield its value at False and at True:

ft1 ∷ (Bool → a) → [a]
ft1 f = [f False, f True]

A sample input converts False to 0 and True to 1:

f1 ∷ Bool → Int
f1 False = 0
f1 True = 1

A sample run without memoization:

*FunctorCombo.MemoTrie> ft1 f1
[0,1]

and one with memoization:

*FunctorCombo.MemoTrie> memo ft1 f1
[0,1]

To illustrate what's going on behind the scenes, the following definitions (all of which type-check) progressively reveal the representation of the underlying memo trie. Most steps result from inlining a single Trie definition (as well as switching between Trie k v and the synonymous form k ↛ v).

trie1a ∷ HasTrie a ⇒ (Bool → a) ↛ [a]
trie1a = trie ft1

trie1b ∷ HasTrie a ⇒ (Bool ↛ a) ↛ [a]
trie1b = trie1a

trie1c ∷ HasTrie a ⇒ (Either () () ↛ a) ↛ [a]
trie1c = trie1a

trie1d ∷ HasTrie a ⇒ ((Trie () × Trie ()) a) ↛ [a]
trie1d = trie1a

trie1e ∷ HasTrie a ⇒ (Trie () a, Trie () a) ↛ [a]
trie1e = trie1a

trie1f ∷ HasTrie a ⇒ (() ↛ a, () ↛ a) ↛ [a]
trie1f = trie1a

trie1g ∷ HasTrie a ⇒ (a, a) ↛ [a]
trie1g = trie1a

trie1h ∷ HasTrie a ⇒ (Trie a ∘ Trie a) [a]
trie1h = trie1a

trie1i ∷ HasTrie a ⇒ a ↛ a ↛ [a]
trie1i = unO trie1a
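
To close the loop, the memoized ft1 from the demo run above is just untrie applied to this trie (memoFt1 here is a name introduced only for illustration):

memoFt1 ∷ HasTrie a ⇒ (Bool → a) → [a]
memoFt1 = untrie trie1a

so that memoFt1 f1 yields [0,1], as in the memoized run above.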

Pragmatics

I'm happy with the correctness and elegance of the method in this post. It gives me the feeling of inevitable simplicity that I strive for -- obvious in hindsight. What about performance? After all, memoization is motivated by a desire for efficiency -- specifically, to reduce the cost of repeatedly applying the same function to the same argument, while keeping almost all of the modularity & simplicity of a naïve algorithm.

Memoization pays off when (a) a function is repeatedly applied to some arguments, and (b) the cost of recomputing an application exceeds the cost of finding the previously computed result. (I'm over-simplifying here. Space efficiency matters also and can affect time efficiency.) The isomorphism technique used in this post and a previous one requires transforming an argument to the isomorphic type for each look-up and from the isomorphic type for each application. (I'm using "isomorphic type" to mean the type for which a HasTrie instance is already defined.) When these transformations are between function and trie form, I wonder how high the break-even threshold becomes.

How might we avoid these transformations, thus reducing the overhead of memoizing?

For conversion to isomorphic type during trie lookup, perhaps the cost could be reduced substantially through deforestation--inlining chains of untrie methods and applying optimizations to eliminate the many intermediate representation layers. GHC has gotten awfully good at this sort of thing. Maybe someone with more Haskell performance analysis & optimization experience than I have would be interested in collaborating.

For trie construction, I suspect the conversion back from the isomorphic type could be avoided by somehow holding onto the original form of the argument, before it was converted to the isomorphic type. I haven't attempted this idea yet.

Another angle on reducing the cost of the isomorphism technique is to use memoization! After all, if memoizing is worthwhile at all, there will be repeated applications of the memoized function to the same arguments. Exactly in such a case, the conversion of arguments to isomorphic form will also be done repeatedly for these same arguments. When a conversion is both expensive and repeated, we'd like to memoize. I don't know how to get off the ground with this idea, however. If I'm trying to memoize a function of type a → b, then the required conversion has type a → a' for some type a' with a HasTrie instance. Memoizing that conversion is just as hard as memoizing the function we started with.

Conclusion

Existing accounts of functional memoization I know of cover functions of the unit type, sums, and products, and they do so quite elegantly.

Type isomorphisms form the consistent, central theme in this work. Functions from unit, sums and products have isomorphic forms with simpler domain types (and so on, recursively). Additional isomorphisms extend these fundamental building blocks to many other types, including integer types and algebraic data types. However, functions over function-valued domains are conspicuously missing (though I hadn't noticed until recently). This post fills that gap neatly, using yet another isomorphism, and moreover an isomorphism that has been staring us in the face all along: the one between functions and tries.

I wonder:

  • Given how this trick shouts to be noticed, has it been discovered and written up?
  • How useful will higher-order memoization turn out to be?
  • How efficient is the straightforward implementation given above?
  • Can the conversions between isomorphic domain types be done inexpensively, perhaps eliminating many altogether?
  • How does non-strict memoization fit in with higher-order memoization?

One Comment

  1. conal:

    Ah, I realized a problem in the technique described in this post. Given a higher-order function to memoize f :: (a -> b) -> c, if the type a (the domain type’s domain type) has infinitely many elements (e.g., Integer or [Bool]), then its corresponding trie t :: a :->: b will be infinitely large. In that case, the usual hyperstrict memoization will insist on evaluating the entire infinite trie. I accidentally avoided this problem in my post by choosing the finite type Bool for a.

    I hope & suspect that non-strict memoization will solve this problem. I’m working on the implementation now.
