Elegant memoization with functional memo tries
Function memoization goes back at least as far as Donald Michie’s 1968 paper. The idea is to stash away results of a function call for reuse when a function is called more than once with the same argument. Given an argument, the memoized function looks up the argument in the internal memo table (often a hash table). If found, the previously computed result is reused. Otherwise, the function is applied, and the result is stored in the table, keyed by the argument. The table and its mutation are kept private from clients of the memo function.
Perhaps surprisingly, memoization can be implemented simply and purely functionally in a lazy functional language. Laziness allows the implementation to build the memo table once and for all, filling in all the results for the function at all domain values. Thanks to laziness, the values don’t actually get computed until they’re used. As usual with lazy data structures, once a component has been evaluated, future accesses come for free.
The implementation described in this post is based one I got from Spencer Janssen. It uses the essential idea of Ralf Hinze’s paper Generalizing Generalized Tries. The library is available on Hackage and a darcs repository. See the wiki page and hackage page for documentation and source code. I’ve compiled it successfully with ghc versions 6.8.2 through 6.10.3.
Edits:
- 2009-02-12: Fixed typo.
- 2009-11-14: Hackage page pointer.
- 2009-11-20: Fixed source code pointer.
Memo tries
The central idea of the function memoizer is associating a type of trie to each domain type want to memoize over.
class HasTrie a where
data (↛) a ∷ * → *
where a ↛ b
represents a trie that maps values of type a
to values of type b
. The trie representation depends only on a
. For more information on the recent "associated types" feature of Haskell, see Associated Types with Class.
In addition to the trie type, the HasTrie
class contains converters between functions and tries:
trie ∷ (a → b) → (a ↛ b)
untrie ∷ (a ↛ b) → (a → b)
The trie
and untrie
methods must be inverses:
trie ∘ untrie ≡ id
untrie ∘ trie ≡ id
Given the HasTrie
class, memoization is trivial to implement:
memo ∷ HasTrie a ⇒ (a → b) → (a → b)
memo = untrie ∘ trie
The second inverse property implies that memo is the identity function (semantically).
Multi-argument curried functions can be memoized by repeated uses of memo
. See the source code.
Some trie representations
Now let's consider choices for trie representations. As in Generalizing Generalized Tries, the key is to exploit some type isomorphisms.
1 → a ≅ a
(a + b) → c ≅ (a → c) × (b → c)
(a × b) → c ≅ a → (b → c)
which correspond to the familiar laws of exponents
a1 = a
ca + b = ca × cb
ca × b = (cb)a
Writing these type isomorphisms in Haskell:
(() → a) ≅ a
(Either a b → c) ≅ (a → c, b → c)
((a,b) → c) ≅ (a → (b → c))
Start with the simplest domain, namely ()
. Since there is only one (non-bottom) value of type ()
, a trie over ()
contains just a single value, as suggested in the isomorphism equation.
instance HasTrie () where
data () ↛ a = UnitTrie a
trie f = UnitTrie (f ())
untrie (UnitTrie a) = const a
A trie for sum-valued domains contains two tries:
instance (HasTrie a, HasTrie b) ⇒ HasTrie (Either a b) where
data (Either a b) ↛ x = EitherTrie (a ↛ x) (b ↛ x)
trie f = EitherTrie (trie (f ∘ Left)) (trie (f ∘ Right))
untrie (EitherTrie s t) = either (untrie s) (untrie t)
While a trie for pair-valued domains contains nested tries:
instance (HasTrie a, HasTrie b) ⇒ HasTrie (a,b) where
data (a,b) ↛ x = PairTrie (a ↛ (b ↛ x))
trie f = PairTrie (trie (trie ∘ curry f))
untrie (PairTrie t) = uncurry (untrie ∘ untrie t)
Exercise: Prove that trie
and untrie
are inverses of each other (in both directions) for the definitions above. My proofs are in the source code.
Tries for other types, such as Bool
, can be implemented via isomorphisms using the types above, or directly. For instance, an unsigned word can be converted to a list of booleans:
instance HasTrie Word where
data Word ↛ a = WordTrie ([Bool] ↛ a)
trie f = WordTrie (trie (f ∘ unbits))
untrie (WordTrie t) = untrie t ∘ bits
where
bits ∷ Bits t ⇒ t → [Bool]
unbits ∷ Bits t ⇒ [Bool] → t
What's coming?
I hope you've gotten a taste of the elegance this functional means of making memo functions. Future blog posts will include:
- algebraic construction of memo tries, using type class morphisms, and
- the use of memo tries for simple and efficient representations of general linear maps.
Conal Elliott » Blog Archive » Composing memo tries:
[…] About « Elegant memoization with functional memo tries […]
15 October 2008, 6:18 pmLuke Palmer:
We will see where you are going this, but I have two comments.
(1) So far, I don’t believe associated types are necessary. HasMemo should be implementable with the simple interface:
All the compositions you have used, and every other composition I have been able to think of, are implementable with this structure.
(2) This one is more subjective, but I don’t think the typeclass approach to memoization is a good one. Here you are limited to one memo trie per type. But there are many ways to memoize a given type! Indeed, you could use newtypes, but that’s clunky and a lot of work. data-memocombinators, which I just released on hackage two days ago (uh… are you and I the same person? :-), exposes memoizers as first-class functions so that they can be custom built; i.e. so you can memoize only integers less than 1,000,000 (the best way to solve one of the Euler problems), or only memoize the Left side of an Either (let’s say the right doesn’t have a trie), etc.
I think competition on Hackage will be nice, to see which approach ends up being more useful. The two approaches can be trivially combined by picking a default memoizer for each type. I wonder if there is a deeper, more flexible way to combine them.
16 October 2008, 12:29 amLuke Palmer:
Oh, your semantics are not quite right. Memoization changes the strictness of functions.
I am not quite sure in what essential way the strictness is changed. For example, it is theoretically possible to memoize a function of infinite streams, given that the return type is compact, so it’s not just that it makes it fully strict in its argument’s domain. I haven’t gotten around to implementing this awesome possibility just yet
16 October 2008, 12:42 amconal:
Yow! It sure does.
Oh, I see the problem:
Hmm. I can fix the
f
case by changing the nonstrictconst a
to the strict() -> a
in the definition ofuntrie (UnitTrie a)
. But then I run into the dual problem:Any thoughts on this dilemma?
16 October 2008, 9:11 amconal:
Yep. A problem with typeclasses in general. Even more generally, a problem with hiding implementation mechanism. Prettier but less flexible. I gather you’ve found the flexibility particularly useful with memoization.
Sounds awfully cool. I’d love to read a blog post about your library, especially where you exploit the flexibility of explicit memoizers.
16 October 2008, 9:21 amDave Menendez:
Well, there’s not much point memoizing a constant function. Also, a strict instance for () is more consistent with the other instances.
16 October 2008, 1:27 pmconal:
I said to Luke
Oh! Now I see that you have written. I’d still like to see some nice examples if you get around to it.
16 October 2008, 3:52 pmConal Elliott » Blog Archive » Vector space bases via type families:
[…] rules are also essentially the same as the ones used for memo tries, but phrased in terms of logarithms instead of (explicit) exponents. Tags: type families, vector […]
19 October 2008, 5:23 pmConal Elliott » Blog Archive » Simpler, more efficient, functional linear maps:
[…] However, there are several other ways to make linear maps, and it would be easy to forget to memoize each combining form. So, instead of the function representation above, I ensure that the function be memoized by representing it as a memo trie. […]
9 November 2008, 2:16 pmConal Elliott » Blog Archive » Memoizing polymorphic functions - part one:
[…] reuses rather than recomputes when applied to the same argument more than once. Variations include not-quite-equivalence due to added strictness, and replacing value equality with pointer […]
10 June 2009, 4:36 pmShawn Willden:
Your source code links are broken. Could you please fix them so I can understand what it is you’re talking about? Thanks.
13 November 2009, 11:13 pmconal:
@Shawn — Fixed now. Thanks for the note.
14 November 2009, 11:22 amConal Elliott » Blog Archive » Memoizing polymorphic functions - part two:
[…] Comments conal on Elegant memoization with functional memo triesShawn Willden on Elegant memoization with functional memo triesDave Crossland on Seeking advice on […]
20 November 2009, 11:33 amConal Elliott » Blog Archive » Nonstrict memoization:
[…] written a few posts about functional memoization. In one of them, Luke Palmer commented that the memoization methods are correct only for strict functions, which I […]
13 July 2010, 6:46 pmConal Elliott » Blog Archive » Elegant memoization with higher-order types:
[…] using the essential idea of Ralf Hinze’s paper Generalizing Generalized Tries. The blog post Elegant memoization with functional memo tries describes a library, MemoTrie, based on both of these sources, and using associated data types. I […]
20 July 2010, 8:48 pmConal Elliott » Blog Archive » Memoizing higher-order functions:
[…] all at once from a denotational perspective, and incrementally from an operational perspective. See Elegant memoization with functional memo tries and Elegant memoization with higher-order […]
21 July 2010, 7:41 amConal Elliott » Blog Archive » Memoizing polymorphic functions via unmemoization:
[…] can work out an answer by appealing to the same laws of exponents used in memoization, but now applied in reverse (to create function types instead of eliminate […]
2 October 2010, 7:40 amAudun:
Hi, I had use for a HasTrie instance for the Void type. I could think of two instances. The instance that would seem to follow from the isomorphisms is:
instance HasTrie Void where data Void :->: a = VoidTrie trie _ = VoidTrie untrie VoidTrie = absurd enumerate VoidTrie = []
That is, there are no functions Void -> a, and a^Void is isomorphic to Unit. However, this instance makes constant functions strict, so that memo (const a) undefined == undefined. This can be alleviated with an instance closer to the Unit instance.
Any thoughts? Would you be willing to add a Void instance to MemoTrie?
21 April 2012, 11:26 amconal:
Hi Audun. I like your idea of a
HasTrie Void
instance. Omitting it was an oversight. I just added your instance and released version 0.4.12.I think you meant “there is one function
6 May 2012, 3:12 pmVoid -> a
” (rather than none), ignoring bottoms, namely the empty function.Audun:
Thank you for adding the instance, and for the attribution
If you make another release, you can add my full name: Audun Skaugen (no hurry, though).
Oh, and sorry for the code formatting in the previous comment; that’s what you get for not previewing.
29 May 2012, 3:12 am