<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Conal Elliott &#187; isomorphism</title>
	<atom:link href="http://conal.net/blog/tag/isomorphism/feed" rel="self" type="application/rss+xml" />
	<link>http://conal.net/blog</link>
	<description>Inspirations &#38; experiments, mainly about denotative/functional programming in Haskell</description>
	<lastBuildDate>Thu, 25 Jul 2019 18:15:11 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.1.17</generator>
	<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2F&amp;language=en_US&amp;category=text&amp;title=Conal+Elliott&amp;description=Inspirations+%26amp%3B+experiments%2C+mainly+about+denotative%2Ffunctional+programming+in+Haskell&amp;tags=blog" type="text/html" />
	<item>
		<title>Memoizing higher-order functions</title>
		<link>http://conal.net/blog/posts/memoizing-higher-order-functions</link>
		<comments>http://conal.net/blog/posts/memoizing-higher-order-functions#comments</comments>
		<pubDate>Wed, 21 Jul 2010 15:41:17 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[isomorphism]]></category>
		<category><![CDATA[memoization]]></category>
		<category><![CDATA[trie]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=119</guid>
		<description><![CDATA[Memoization incrementally converts functions into data structures. It pays off when a function is repeatedly applied to the same arguments and applying the function is more expensive than accessing the corresponding data structure. In lazy functional memoization, the conversion from function to data structure happens all at once from a denotational perspective, and incrementally from [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p
>Memoization incrementally converts functions into data structures. It pays off when a function is repeatedly applied to the same arguments and applying the function is more expensive than accessing the corresponding data structure.</p
>

<p
>In <em
  >lazy functional</em
  > memoization, the conversion from function to data structure happens all at once from a denotational perspective, and incrementally from an operational perspective. See <em
  ><a href="http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/" title="blog post"
    >Elegant memoization with functional memo tries</a
    ></em
  > and <em
  ><a href="http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/" title="blog post"
    >Elegant memoization with higher-order types</a
    ></em
  >.</p
>

<p
>As Ralf Hinze presented in <em
  ><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.4069" title="Paper: &quot;Generalizing Generalized Tries&quot; by Ralf Hinze"
    >Generalizing Generalized Tries</a
    ></em
  >, trie-based memoization follows from three simple isomorphisms involving function types:</p
>

<div class=math-inset>
<p
><span class="math"
  >1 → <em
    >a</em
    > ≅ <em
    >a</em
    ></span
  ></p
><p
><span class="math"
  >(<em
    >a</em
    > + <em
    >b</em
    >) → <em
    >c</em
    > ≅ (<em
    >a</em
    > → <em
    >c</em
    >) × (<em
    >b</em
    > → <em
    >c</em
    >)</span
  ></p
><p
><span class="math"
  >(<em
    >a</em
    > × <em
    >b</em
    >) → <em
    >c</em
    > ≅ <em
    >a</em
    > → (<em
    >b</em
    > → <em
    >c</em
    >)</span
  ></p
></div>

<p
>which correspond to the familiar laws of exponents</p
>

<div class=math-inset>
<p
><span class="math"
  ><em>a</em><sup>1</sup> = <em>a</em></span
  ></p
><p
><span class="math"
  ><em
    >c</em
    ><sup
    ><em
      >a</em
      > + <em
      >b</em
      ></sup
    > = <em
    >c</em
    ><sup
    ><em
      >a</em
      ></sup
    > × <em
    >c</em
    ><sup
    ><em
      >b</em
      ></sup
    ></span
  ></p
><p
><span class="math"
  ><em
    >c</em
    ><sup
    ><em
      >a</em
      > × <em
      >b</em
      ></sup
    > = (<em
    >c</em
    ><sup
    ><em
      >b</em
      ></sup
    >)<sup
    ><em
      >a</em
      ></sup
    ></span
  ></p
></div>

<p
>When applied as a transformation from left to right, each law simplifies the domain part of a function type. Repeated application of the rules then eliminates all function types or reduces them to functions of atomic types. These atomic domains are eliminated as well by additional mappings, such as between a natural number and a list of bits (as in <a href="http://hackage.haskell.org/packages/archive/containers/0.2.0.1/doc/html/Data-IntMap.html" title="hackage package"
  >patricia trees</a
  >). Algebraic data types correspond to sums of products and so are eliminated by the sum and product rules. <em
  >Recursive</em
  > algebraic data types (lists, trees, etc.) give rise to correspondingly recursive trie types.</p
>
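<p>These three isomorphisms can be witnessed directly by ordinary Haskell functions. The sketch below uses hypothetical names (for illustration only, not from any library); each pair of functions is mutually inverse:</p>

```haskell
-- Witnesses for the three isomorphisms (hypothetical names, for illustration).

-- 1 -> a  ≅  a
fromUnitFun :: (() -> a) -> a
fromUnitFun f = f ()

toUnitFun :: a -> (() -> a)
toUnitFun = const

-- (a + b) -> c  ≅  (a -> c) × (b -> c)
fromSumFun :: (Either a b -> c) -> (a -> c, b -> c)
fromSumFun f = (f . Left, f . Right)

toSumFun :: (a -> c, b -> c) -> (Either a b -> c)
toSumFun (g, h) = either g h

-- (a × b) -> c  ≅  a -> (b -> c)
fromProdFun :: ((a, b) -> c) -> (a -> b -> c)
fromProdFun = curry

toProdFun :: (a -> b -> c) -> ((a, b) -> c)
toProdFun = uncurry
```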

<p
>So, with a few simple and familiar rules, we can memoize functions over an infinite variety of common types. Have we missed any?</p
>

<p
>Yes. <em
  >What about functions over functions?</em
  ></p
>

<p
><strong
  >Edits</strong
  >:</p
>

<ul
><li
  >2010-07-22: Made the memoization example polymorphic and switched from pairs to lists. The old example accidentally coincided with a specialized version of <code
    >trie</code
    > itself.</li
  ><li
  >2011-02-27: updated some notation</li
  ></ul
>

<p><span id="more-119"></span></p>

<div id="tries"
><h3
  >Tries</h3
  ><p
  >In <em
    ><a href="http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/" title="blog post"
      >Elegant memoization with higher-order types</a
      ></em
    >, I showed a formulation of functional memoization using functor combinators.</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >type</span
      > k &#8603; v <span class="fu"
      >=</span
      > <span class="dt"
      >Trie</span
      > k v<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >class</span
      > <span class="dt"
      >HasTrie</span
      > k <span class="kw"
      >where</span
      ><br
       />  <span class="kw"
      >type</span
      > <span class="dt"
      >Trie</span
      > k <span class="dv"
      >&#8759;</span
      > <span class="fu"
      >*</span
      > &#8594; <span class="fu"
      >*</span
      ><br
       />  trie   <span class="dv"
      >&#8759;</span
      > (k &#8594; v) &#8594; (k &#8603; v)<br
       />  untrie <span class="dv"
      >&#8759;</span
      > (k &#8603; v) &#8594; (k &#8594; v)<br
       /></code
    ></pre
  ><p
  >I will describe higher-order memoization in terms of this formulation. I imagine it would also work out, though less elegantly, in the associated data types formulation described in <em
    ><a href="http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/" title="blog post"
      >Elegant memoization with functional memo tries</a
      ></em
    >.</p
  ></div
>
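<p>With <code>trie</code> and <code>untrie</code> in hand, memoization itself is just the round trip through the trie. The following self-contained sketch shows the idea; the <code>Bool</code> instance here is hand-rolled for illustration, whereas the library derives its instances from the functor combinators:</p>

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A trie for Bool is just a two-slot pair.
data Pair v = Pair v v

class HasTrie k where
  type Trie k :: * -> *
  trie   :: (k -> v) -> Trie k v
  untrie :: Trie k v -> (k -> v)

instance HasTrie Bool where
  type Trie Bool = Pair
  trie f = Pair (f False) (f True)
  untrie (Pair x _) False = x
  untrie (Pair _ y) True  = y

-- Memoization is the round trip: tabulate, then index.
memo :: HasTrie k => (k -> v) -> (k -> v)
memo = untrie . trie
```

<p>Note that applying <code>memo</code> forces nothing up front; the slots of the <code>Pair</code> are filled in lazily, as results are demanded.</p>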

<div id="domain-isomorphisms"
><h3
  >Domain isomorphisms</h3
  ><p
  ><em
    ><a href="http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/" title="blog post"
      >Elegant memoization with higher-order types</a
      ></em
    > showed how to define a <code
    >HasTrie</code
    > instance in terms of the instance of an isomorphic type, e.g., reducing tuples to nested pairs or booleans to a sum of unit types. A C macro, <code
    >HasTrieIsomorph</code
     >, encapsulates the domain isomorphism technique. For instance, to reduce triples to pairs:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( (<span class="dt"
      >HasTrie</span
      > a, <span class="dt"
      >HasTrie</span
      > b, <span class="dt"
      >HasTrie</span
      > c), (a,b,c), ((a,b),c)<br
       />               , &#955; (a,b,c) &#8594; ((a,b),c), &#955; ((a,b),c) &#8594; (a,b,c))<br
       /></code
    ></pre
  ><p
  >This isomorphism technique applies as well to the standard functor combinators used for constructing tries (and many other purposes). Those combinators again:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > <span class="dt"
      >Const</span
      > x a <span class="fu"
      >=</span
      > <span class="dt"
      >Const</span
      > x<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > <span class="dt"
      >Id</span
      > a <span class="fu"
      >=</span
      > <span class="dt"
      >Id</span
      > a<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > (f &#215; g) a <span class="fu"
      >=</span
      > f a &#215; g a<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > (f <span class="fu"
      >+</span
      > g) a <span class="fu"
      >=</span
      > <span class="dt"
      >InL</span
      > (f a) <span class="fu"
      >|</span
      > <span class="dt"
      >InR</span
      > (g a)<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >newtype</span
      > (g &#8728; f) a <span class="fu"
      >=</span
      > <span class="dt"
      >O</span
      > (g (f a))<br
       /></code
    ></pre
  ><p
  >and their trie definitions:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( <span class="dt"
      >HasTrie</span
      > a, <span class="dt"
      >Const</span
      > a x, a, getConst, <span class="dt"
      >Const</span
      > )<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( <span class="dt"
      >HasTrie</span
      > a, <span class="dt"
      >Id</span
      > a, a, unId, <span class="dt"
      >Id</span
      > )<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( (<span class="dt"
      >HasTrie</span
      > (f a), <span class="dt"
      >HasTrie</span
      > (g a))<br
       />               , (f &#215; g) a, (f a,g a)<br
       />               , &#955; (fa &#215; ga) &#8594; (fa,ga), &#955; (fa,ga) &#8594; (fa &#215; ga) )<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( (<span class="dt"
      >HasTrie</span
      > (f a), <span class="dt"
      >HasTrie</span
      > (g a))<br
       />               , (f <span class="fu"
      >+</span
      > g) a, <span class="dt"
      >Either</span
      > (f a) (g a)<br
       />               , eitherF <span class="kw"
      >Left</span
      > <span class="kw"
      >Right</span
      >, <span class="fu"
      >either</span
      > <span class="dt"
      >InL</span
      > <span class="dt"
      >InR</span
      > )<br
       /></code
    ></pre
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( <span class="dt"
      >HasTrie</span
      > (g (f a))<br
       />               , (g &#8728; f) a, g (f a) , unO, <span class="dt"
      >O</span
      > )<br
       /></code
    ></pre
  ><p
  >The <code
    >eitherF</code
    > function is a variation on <a href="http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Prelude.html#v:either"
    ><code
      >either</code
      ></a
    >:</p
  ><pre class="sourceCode haskell"
  ><code
    >eitherF <span class="dv"
      >&#8759;</span
      > (f a &#8594; b) &#8594; (g a &#8594; b) &#8594; (f <span class="fu"
      >+</span
      > g) a &#8594; b<br
       />eitherF p _ (<span class="dt"
      >InL</span
      > fa) <span class="fu"
      >=</span
      > p fa<br
       />eitherF _ q (<span class="dt"
      >InR</span
      > ga) <span class="fu"
      >=</span
      > q ga<br
       /></code
    ></pre
  ></div
>

<div id="higher-order-memoization"
><h3
  >Higher-order memoization</h3
  ><p
  >Now higher-order memoization is easy. Apply yet another isomorphism, this time between functions and tries: The <code
    >trie</code
    > and <code
    >untrie</code
    > methods are exactly the mappings we need.</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="dt"
      >HasTrieIsomorph</span
      >( (<span class="dt"
      >HasTrie</span
      > a, <span class="dt"
      >HasTrie</span
      > (a &#8603; b))<br
       />               , a &#8594; b, a &#8603; b, trie, untrie)<br
       /></code
    ></pre
  ><p
  >So, to memoize a higher-order function <code
    >f &#8759; (a &#8594; b) &#8594; v</code
     >, we only need a trie type for <code
    >a</code
    > and one for <code
    >a &#8603; b</code
     >. The latter (tries for trie-valued domains) are provided by the isomorphisms above, together with additional ones.</p
  ></div
>
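<p>To make this concrete at a particular domain: a trie for <code>Bool</code> is just a pair, so a higher-order function over <code>Bool → Int</code> can be re-indexed by pairs. A hand-rolled sketch (hypothetical names, not the output of the macro):</p>

```haskell
-- The function/trie isomorphism at Bool: a pair is exactly a Bool-indexed trie.
toTrieB :: (Bool -> a) -> (a, a)      -- plays the role of trie
toTrieB f = (f False, f True)

fromTrieB :: (a, a) -> (Bool -> a)    -- plays the role of untrie
fromTrieB (x, _) False = x
fromTrieB (_, y) True  = y

-- A higher-order function over Bool -> Int becomes a first-order function
-- over pairs, whose domain the earlier rules already know how to memoize.
viaTrie :: ((Bool -> Int) -> v) -> ((Int, Int) -> v)
viaTrie g = g . fromTrieB
```

<p>Memoizing <code>viaTrie g</code> then needs only a trie for <code>(Int, Int)</code>, which the product and integer rules provide.</p>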

<div id="demo"
><h3
  >Demo</h3
  ><p
  >Our sample higher-order function will take a function of booleans and yield its value at <code
    >False</code
    > and at <code
    >True</code
    >:</p
  ><pre class="sourceCode haskell"
  ><code
    >ft1 <span class="dv"
      >&#8759;</span
      > (<span class="dt"
      >Bool</span
      > &#8594; a) &#8594; [a]<br
       />ft1 f <span class="fu"
      >=</span
      > [f <span class="kw"
      >False</span
      >, f <span class="kw"
      >True</span
      >]<br
       /></code
    ></pre
  ><p
  >A sample input converts <code
    >False</code
    > to <code
    >0</code
    > and <code
    >True</code
    > to <code
    >1</code
    >:</p
  ><pre class="sourceCode haskell"
  ><code
    >f1 <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >Bool</span
      > &#8594; <span class="dt"
      >Int</span
      ><br
       />f1 <span class="kw"
      >False</span
      > <span class="fu"
      >=</span
      > <span class="dv"
      >0</span
      ><br
       />f1 <span class="kw"
      >True</span
      >  <span class="fu"
      >=</span
      > <span class="dv"
      >1</span
      ><br
       /></code
    ></pre
  ><p
  >A sample run without memoization:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="fu"
      >*</span
      ><span class="dt"
      >FunctorCombo.MemoTrie</span
      ><span class="fu"
      >&gt;</span
      > ft1 f1<br
       />[<span class="dv"
      >0</span
      >,<span class="dv"
      >1</span
      >]<br
       /></code
    ></pre
  ><p
  >and one with memoization:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="fu"
      >*</span
      ><span class="dt"
      >FunctorCombo.MemoTrie</span
      ><span class="fu"
      >&gt;</span
      > memo ft1 f1<br
       />[<span class="dv"
      >0</span
      >,<span class="dv"
      >1</span
      >]<br
       /></code
    ></pre
  ><p
  >To illustrate what's going on behind the scenes, the following definitions (all of which type-check) progressively reveal the representation of the underlying memo trie. Most steps result from inlining a single <code
    >Trie</code
    > definition (as well as switching between <code
    >Trie k v</code
    > and the synonymous form <code
    >k &#8603; v</code
    >).</p
  ><pre class="sourceCode haskell"
  ><code
    >trie1a <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (<span class="dt"
      >Bool</span
      > &#8594; a) &#8603; [a]<br
       />trie1a <span class="fu"
      >=</span
      > trie ft1<br
       /><br
       />trie1b <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (<span class="dt"
      >Bool</span
      > &#8603; a) &#8603; [a]<br
       />trie1b <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1c <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (<span class="dt"
      >Either</span
      > () () &#8603; a) &#8603; [a]<br
       />trie1c <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1d <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; ((<span class="dt"
      >Trie</span
      > () &#215; <span class="dt"
      >Trie</span
      > ()) a) &#8603; [a]<br
       />trie1d <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1e <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (<span class="dt"
      >Trie</span
      > () a, <span class="dt"
      >Trie</span
      > () a) &#8603; [a]<br
       />trie1e <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1f <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (() &#8603; a, () &#8603; a) &#8603; [a]<br
       />trie1f <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1g <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (a, a) &#8603; [a]<br
       />trie1g <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1h <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; (<span class="dt"
      >Trie</span
      > a &#8728; <span class="dt"
      >Trie</span
      > a) [a]<br
       />trie1h <span class="fu"
      >=</span
      > trie1a<br
       /><br
       />trie1i <span class="dv"
      >&#8759;</span
      > <span class="dt"
      >HasTrie</span
      > a &#8658; a &#8603; a &#8603; [a]<br
       />trie1i <span class="fu"
      >=</span
      > unO trie1a<br
       /></code
    ></pre
  ></div
>

<div id="pragmatics"
><h3
  >Pragmatics</h3
  ><p
  >I'm happy with the correctness and elegance of the method in this post. It gives me the feeling of inevitable simplicity that I strive for -- obvious in hindsight. What about performance? After all, memoization is motivated by a desire for efficiency -- specifically, to reduce the cost of repeatedly applying the same function to the same argument, while keeping almost all of the modularity &amp; simplicity of a naïve algorithm.</p
  ><p
  >Memoization pays off when (a) a function is repeatedly applied to some arguments, and (b) when the cost of recomputing an application exceeds the cost of finding the previously computed result. (I'm over-simplifying here. Space efficiency matters also and can affect time efficiency.) The isomorphism technique used in this post and <a href="http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/" title="blog post"
    >a previous one</a
    > requires transforming an argument to the isomorphic type for each look-up and from the isomorphic type for each application. (I'm using &quot;isomorphic type&quot; to mean the type for which a <code
    >HasTrie</code
    > instance is already defined.) When these transformations are between function and trie form, I wonder how high the break-even threshold becomes.</p
  ><p
  >How might we avoid these transformations, thus reducing the overhead of memoizing?</p
  ><p
  >For conversion to isomorphic type during trie lookup, perhaps the cost could be reduced substantially through deforestation--inlining chains of <code
    >untrie</code
    > methods and applying optimizations to eliminate the many intermediate representation layers. GHC has gotten awfully good at this sort of thing. Maybe someone with more Haskell performance analysis &amp; optimization experience than I have would be interested in collaborating.</p
  ><p
  >For trie construction, I suspect the conversion <em
    >back</em
    > from the isomorphic type could be avoided by somehow holding onto the original form of the argument, before it was converted to the isomorphic type. I haven't attempted this idea yet.</p
  ><p
  >Another angle on reducing the cost of the isomorphism technique is to use memoization! After all, if memoizing is worthwhile at all, there will be repeated applications of the memoized function to the same arguments. Exactly in such a case, the conversion of arguments to isomorphic form will also be done repeatedly for these same arguments. When a conversion is both expensive and repeated, we'd like to memoize. I don't know how to get off the ground with this idea, however. If I'm trying to memoize a function of type <code
    >a &#8594; b</code
    >, then the required conversion has type <code
    >a &#8594; a'</code
    > for some type <code
    >a'</code
    > with a <code
    >HasTrie</code
    > instance. Memoizing that conversion is just as hard as memoizing the function we started with.</p
  ></div
>

<div id="conclusion"
><h3
  >Conclusion</h3
  ><p
  >Existing accounts of functional memoization I know of cover functions of the unit type, sums, and products, and they do so quite elegantly.</p
  ><p
  ><em
    >Type isomorphisms</em
    > form the consistent, central theme in this work. Functions from unit, sums and products have isomorphic forms with simpler domain types (and so on, recursively). Additional isomorphisms extend these fundamental building blocks to many other types, including integer types and algebraic data types. However, functions over function-valued domains are conspicuously missing (though I hadn't noticed until recently). This post fills that gap neatly, using yet another isomorphism, and moreover an isomorphism that has been staring us in the face all along: the one between functions and tries.</p
  ><p
  >I wonder:</p
  ><ul
  ><li
    >Given how this trick shouts to be noticed, has it been discovered and written up?</li
    ><li
    >How useful will higher-order memoization turn out to be?</li
    ><li
    >How efficient is the straightforward implementation given above?</li
    ><li
    >Can the conversions between isomorphic domain types be done inexpensively, perhaps eliminating many altogether?</li
    ><li
    >How does non-strict memoization fit in with higher-order memoization?</li
    ></ul
  ></div
>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/memoizing-higher-order-functions/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fmemoizing-higher-order-functions&amp;language=en_GB&amp;category=text&amp;title=Memoizing+higher-order+functions&amp;description=Memoization+incrementally+converts+functions+into+data+structures.+It+pays+off+when+a+function+is+repeatedly+applied+to+the+same+arguments+and+applying+the+function+is+more+expensive+than+accessing+the...&amp;tags=isomorphism%2Cmemoization%2Ctrie%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Elegant memoization with higher-order types</title>
		<link>http://conal.net/blog/posts/elegant-memoization-with-higher-order-types</link>
		<comments>http://conal.net/blog/posts/elegant-memoization-with-higher-order-types#comments</comments>
		<pubDate>Wed, 21 Jul 2010 04:48:22 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[isomorphism]]></category>
		<category><![CDATA[memoization]]></category>
		<category><![CDATA[trie]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=117</guid>
		<description><![CDATA[A while back, I got interested in functional memoization, especially after seeing some code from Spencer Janssen using the essential idea of Ralf Hinze&#8217;s paper Generalizing Generalized Tries. The blog post Elegant memoization with functional memo tries describes a library, MemoTrie, based on both of these sources, and using associated data types. I would have [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Elegant memoization with higher-order types

Tags: functor, memoization, isomorphism, trie

URL: http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/

-->

<!-- references -->

<!-- teaser -->

<p>A while back, I got interested in functional memoization, especially after seeing some code from Spencer Janssen using the essential idea of Ralf Hinze&#8217;s paper <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.4069" title="Paper: &quot;Generalizing Generalized Tries&quot; by Ralf Hinze">Generalizing Generalized Tries</a></em>.
The blog post <em><a href="http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/" title="blog post">Elegant memoization with functional memo tries</a></em> describes a library, <a href="http://haskell.org/haskellwiki/MemoTrie" title="Haskell wiki page for the MemoTrie library">MemoTrie</a>, based on both of these sources, and using <a href="http://www.cse.unsw.edu.au/~chak/papers/papers.html#assoc" title="Paper: &quot;Associated Types with Class&quot;">associated data types</a>.
I would have rather used associated type synonyms and standard types, but I couldn&#8217;t see how to get the details to work out.
Recently, while playing with functor combinators, I realized that they might work for memoization, which they do quite nicely.</p>

<p>This blog post shows how functor combinators lead to an even more elegant formulation of functional memoization.
The code is available as part of the <a href="http://hackage.haskell.org/package/functor-combo" title="Hackage entry: functor-combo">functor-combo</a> package.</p>

<p>The techniques in this post are not so much new as they are ones that have recently been sinking in for me.
See <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.4069" title="Paper: &quot;Generalizing Generalized Tries&quot; by Ralf Hinze">Generalizing Generalized Tries</a></em>, as well as <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.2412" title="Paper by Alexey Rodriguez, Stefan Holdermans, Andres Löh, and Johan Jeuring">Generic programming with fixed points for mutually recursive datatypes</a></em>.</p>

<p><strong>Edits</strong>:</p>

<ul>
<li>2011-01-28: Fixed small typo: &#8220;<em>b^^a^^</em>&#8221; ⟼ &#8220;<em>b<sup>a</sup></em>&#8221;</li>
<li>2010-09-10: Corrected <code>Const</code> definition to use <code>newtype</code> instead of <code>data</code>.</li>
<li>2010-09-10: Added missing <code>Unit</code> type definition (as <code>Const ()</code>).</li>
</ul>

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-117"></span></p>

<h3>Tries as associated data type</h3>

<p>The <a href="http://haskell.org/haskellwiki/MemoTrie" title="Haskell wiki page for the MemoTrie library">MemoTrie</a> library is centered on a class <code>HasTrie</code> with an associated data type of tries (efficient indexing structures for memoized functions):</p>

<pre><code>class HasTrie k where
    data (:→:) k :: * → *
    trie   :: (k  →  v) → (k :→: v)
    untrie :: (k :→: v) → (k  →  v)
</code></pre>

<p>The type <code>a :→: b</code> represents a trie that maps values of type <code>a</code> to values of type <code>b</code>.
The trie representation depends only on <code>a</code>.</p>

<p>Memoization is a simple combination of these two methods:</p>

<pre><code>memo :: HasTrie a ⇒ (a → b) → (a → b)
memo = untrie . trie
</code></pre>

<p>The <code>HasTrie</code> instance definitions correspond to isomorphisms involving function types.
The isomorphisms correspond to the familiar rules of exponents, if we translate <em>a → b</em> into <em>b<sup>a</sup></em>.
(See <em><a href="http://conal.net/blog/posts/elegant-memoization-with-functional-memo-tries/" title="blog post">Elegant memoization with functional memo tries</a></em> for more explanation.)</p>

<pre><code>instance HasTrie () where
    data () :→: x = UnitTrie x
    trie f = UnitTrie (f ())
    untrie (UnitTrie x) = const x

instance (HasTrie a, HasTrie b) ⇒ HasTrie (Either a b) where
    data (Either a b) :→: x = EitherTrie (a :→: x) (b :→: x)
    trie f = EitherTrie (trie (f . Left)) (trie (f . Right))
    untrie (EitherTrie s t) = either (untrie s) (untrie t)

instance (HasTrie a, HasTrie b) ⇒ HasTrie (a,b) where
    data (a,b) :→: x = PairTrie (a :→: (b :→: x))
    trie f = PairTrie (trie (trie . curry f))
    untrie (PairTrie t) = uncurry (untrie .  untrie t)
</code></pre>
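<p>To see these instances in action, here is a self-contained transcription (ASCII operators in place of the Unicode ones) of the class, the unit and sum instances, and <code>memo</code>:</p>

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators #-}

-- ASCII transcription of the class and two of its instances, for checking
-- that trie and untrie round-trip as expected.
class HasTrie k where
  data (:->:) k :: * -> *
  trie   :: (k -> v) -> (k :->: v)
  untrie :: (k :->: v) -> (k -> v)

memo :: HasTrie a => (a -> b) -> (a -> b)
memo = untrie . trie

instance HasTrie () where
  data () :->: x = UnitTrie x
  trie f = UnitTrie (f ())
  untrie (UnitTrie x) = const x

instance (HasTrie a, HasTrie b) => HasTrie (Either a b) where
  data (Either a b) :->: x = EitherTrie (a :->: x) (b :->: x)
  trie f = EitherTrie (trie (f . Left)) (trie (f . Right))
  untrie (EitherTrie s t) = either (untrie s) (untrie t)
```

<p>For example, <code>memo (either (const 3) (const 4))</code> builds an <code>EitherTrie</code> of two <code>UnitTrie</code>s and then indexes into it.</p>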

<h3>Functors and functor combinators</h3>

<p>For notational convenience, let &#8220;<code>(:→:)</code>&#8221; be a synonym for &#8220;<code>Trie</code>&#8221;:</p>

<pre><code>type k :→: v = Trie k v
</code></pre>

<p>And replace the associated <code>data</code> with an associated <code>type</code>.</p>

<pre><code>class HasTrie k where
    type Trie k :: * → *
    trie   :: (k  →  v) → (k :→: v)
    untrie :: (k :→: v) → (k  →  v)
</code></pre>

<p>Then, imitating the three <code>HasTrie</code> instances above,</p>

<pre><code>type Trie () v = v

type Trie (Either a b) v = (Trie a v, Trie b v)

type Trie (a,b) v = Trie a (Trie b v)
</code></pre>

<p>Imagine that we have type lambdas for writing higher-kinded types.</p>

<pre><code>type Trie () = λ v → v

type Trie (Either a b) = λ v → (Trie a v, Trie b v)

type Trie (a,b) = λ v → Trie a (Trie b v)
</code></pre>

<p>Type lambdas are often written as &#8220;Λ&#8221; (capital &#8220;λ&#8221;) instead.
In the land of values, these three right-hand sides correspond to common building blocks for functions, namely identity, product, and composition:</p>

<pre><code>id      = λ v → v
f *** g = λ v → (f v, g v)
g  .  f = λ v → g (f v)
</code></pre>

<p>These building blocks arise in the land of types.</p>

<pre><code>newtype Id a = Id a

data (f :*: g) a = f a :*: g a

newtype (g :. f) a = O (g (f a))
</code></pre>

<p>where <code>Id</code>, <code>f</code> and <code>g</code> are functors.
Sum and a constant functor are also common building blocks:</p>

<pre><code>data (f :+: g) a = InL (f a) | InR (g a)

newtype Const x a = Const x

type Unit = Const () -- one non-⊥ inhabitant
</code></pre>

<h3>Tries as associated type synonym</h3>

<p>Given these standard definitions, we can eliminate the special-purpose data types used, replacing them with our standard functor combinators:</p>

<pre><code>instance HasTrie () where
  type Trie ()  = Id
  trie   f      = Id (f ())
  untrie (Id v) = const v

instance (HasTrie a, HasTrie b) =&gt; HasTrie (Either a b) where
  type Trie (Either a b) = Trie a :*: Trie b
  trie   f           = trie (f . Left) :*: trie (f . Right)
  untrie (ta :*: tb) = untrie ta `either` untrie tb

instance (HasTrie a, HasTrie b) ⇒ HasTrie (a , b) where
  type Trie (a , b) = Trie a :. Trie b
  trie   f      = O (trie (trie . curry f))
  untrie (O tt) = uncurry (untrie . untrie tt)
</code></pre>

<p>At first blush, it might appear that we&#8217;ve simply moved the data type definitions outside of the instances.
However, the extracted functor combinators have other uses, as explored in polytypic programming.
I&#8217;ll point out some of these uses in the next few blog posts.</p>

<h3>Isomorphisms</h3>

<p>Many types are isomorphic variations, and so their corresponding tries can share a common representation.
For instance, triples are isomorphic to nested pairs:</p>

<pre><code>detrip :: (a,b,c) → ((a,b),c)
detrip (a,b,c) = ((a,b),c)

trip :: ((a,b),c) → (a,b,c)
trip ((a,b),c) = (a,b,c)
</code></pre>

<p>A trie for triples can be a trie for nested pairs (already defined).
The <code>trie</code> and <code>untrie</code> methods then just perform conversions around the corresponding methods on pairs:</p>

<pre><code>instance (HasTrie a, HasTrie b, HasTrie c) ⇒ HasTrie (a,b,c) where
    type Trie (a,b,c) = Trie ((a,b),c)
    trie f = trie (f . trip)
    untrie t = untrie t . detrip
</code></pre>

<p>All type isomorphisms can use this same pattern.
I don&#8217;t think Haskell is sufficiently expressive to capture this pattern within the language, so I&#8217;ll resort to a C macro.
There are five parameters:</p>

<ul>
<li><code>Context</code>: the instance context;</li>
<li><code>Type</code>: the type whose instance is being defined;</li>
<li><code>IsoType</code>: the isomorphic type;</li>
<li><code>toIso</code>: conversion function <em>to</em> <code>IsoType</code>; and</li>
<li><code>fromIso</code>: conversion function <em>from</em> <code>IsoType</code>.</li>
</ul>

<p>The macro:</p>

<pre><code>#define HasTrieIsomorph(Context,Type,IsoType,toIso,fromIso) \
instance Context ⇒ HasTrie (Type) where { \
  type Trie (Type) = Trie (IsoType); \
  trie f = trie (f . (fromIso)); \
  untrie t = untrie t . (toIso); \
}
</code></pre>

<p>Now we can easily define <code>HasTrie</code> instances:</p>

<pre><code>HasTrieIsomorph( (), Bool, Either () ()
               , λ c → if c then Left () else Right ()
               , either (λ () → True) (λ () → False))

HasTrieIsomorph( (HasTrie a, HasTrie b, HasTrie c), (a,b,c), ((a,b),c)
               , λ (a,b,c) → ((a,b),c), λ ((a,b),c) → (a,b,c))

HasTrieIsomorph( (HasTrie a, HasTrie b, HasTrie c, HasTrie d)
               , (a,b,c,d), ((a,b,c),d)
               , λ (a,b,c,d) → ((a,b,c),d), λ ((a,b,c),d) → (a,b,c,d))
</code></pre>

<p>In most (but not all) cases, the first argument (<code>Context</code>) could simply say that the isomorphic type is in <code>HasTrie</code>, e.g.,</p>

<pre><code>HasTrieIsomorph( HasTrie ((a,b),c), (a,b,c), ((a,b),c)
               , λ (a,b,c) → ((a,b),c), λ ((a,b),c) → (a,b,c))
</code></pre>

<p>We could define another macro that captures this pattern and requires one fewer argument.
On the other hand, there is merit to keeping the contextual requirements explicit.</p>
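<p>To see what these macro calls expand to, here is the <code>Bool</code> instance written out by hand, bundled with just enough of the machinery above to stand alone (a plain-ASCII sketch, not the actual library code; <code>UndecidableInstances</code> is enabled since the instance reuses the <code>Trie</code> family on its right-hand side):</p>

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators, UndecidableInstances #-}

newtype Id v     = Id v
data (f :*: g) v = f v :*: g v

class HasTrie a where
  type Trie a :: * -> *
  trie   :: (a -> v) -> Trie a v
  untrie :: Trie a v -> (a -> v)

instance HasTrie () where
  type Trie () = Id
  trie f        = Id (f ())
  untrie (Id v) = const v

instance (HasTrie a, HasTrie b) => HasTrie (Either a b) where
  type Trie (Either a b) = Trie a :*: Trie b
  trie   f           = trie (f . Left) :*: trie (f . Right)
  untrie (ta :*: tb) = either (untrie ta) (untrie tb)

-- What HasTrieIsomorph( (), Bool, Either () (), ...) expands to:
instance HasTrie Bool where
  type Trie Bool = Trie (Either () ())
  trie f   = trie (f . either (\() -> True) (\() -> False))
  untrie t = untrie t . (\c -> if c then Left () else Right ())
```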

<h3>Regular data types</h3>

<p>A regular data type is one in which the recursive uses are at the same type.
Functions over such types are often defined via <em>monomorphic</em> recursion.
Data types that do not satisfy this constraint are called &#8220;<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.3551" title="Paper by Richard Bird and Lambert Meertens">nested</a>&#8220;.</p>

<p>As in several recent generic programming systems, regular data types can be encoded generically through a type class that unwraps one level of functor from a type.
The regular data type is the fixpoint of that functor.
See, e.g., <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.6390" title="Paper by Ulf Norell and Patrik Jansson">Polytypic programming in Haskell</a></em>.
Adopting the style of <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.4778" title="Paper by Thomas Noort, Alexey Rodriguez, Stefan Holdermans, Johan Jeuring, Bastiaan Heeren">A Lightweight Approach to Datatype-Generic Rewriting</a></em>,</p>

<pre><code>class Functor (PF t) ⇒ Regular t where
  type PF t :: * → *
  wrap   :: PF t t → t
  unwrap :: t → PF t t
</code></pre>

<p>Here &#8220;<code>PF</code>&#8221; stands for &#8220;pattern functor&#8221;.</p>

<p>The pattern functors can be constructed out of the functor combinators above.
For instance, a list at the top level is either empty or a value and a list.
Translating this description:</p>

<pre><code>instance Regular [a] where
  type PF [a] = Unit :+: (Const a :*: Id)

  unwrap []     = InL (Const ())
  unwrap (a:as) = InR (Const a :*: Id as)

  wrap (InL (Const ()))          = []
  wrap (InR (Const a :*: Id as)) = a:as
</code></pre>

<p>As another example, consider rose trees (<code>Data.Tree</code>):</p>

<pre><code>data Tree a = Node a [Tree a]

instance Regular (Tree a) where

  type PF (Tree a) = Const a :*: []

  unwrap (Node a ts) = Const a :*: ts

  wrap (Const a :*: ts) = Node a ts
</code></pre>
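<p>The payoff of factoring through <code>Regular</code> is that generic operations become one-liners. For instance, a generic fold (catamorphism) works for every regular type. Here is a self-contained plain-ASCII sketch, with the <code>Functor</code> constraint stated on the fold itself rather than as a superclass, to keep the required extensions minimal:</p>

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators, FlexibleContexts #-}

newtype Id a      = Id a
data (f :+: g) a  = InL (f a) | InR (g a)
data (f :*: g) a  = f a :*: g a
newtype Const x a = Const x
type Unit         = Const ()

instance Functor Id where
  fmap h (Id a) = Id (h a)
instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (InL fa) = InL (fmap h fa)
  fmap h (InR ga) = InR (fmap h ga)
instance (Functor f, Functor g) => Functor (f :*: g) where
  fmap h (fa :*: ga) = fmap h fa :*: fmap h ga
instance Functor (Const x) where
  fmap _ (Const x) = Const x

class Regular t where
  type PF t :: * -> *
  wrap   :: PF t t -> t
  unwrap :: t -> PF t t

instance Regular [a] where
  type PF [a] = Unit :+: (Const a :*: Id)
  unwrap []     = InL (Const ())
  unwrap (a:as) = InR (Const a :*: Id as)
  wrap (InL (Const ()))          = []
  wrap (InR (Const a :*: Id as)) = a : as

-- Generic fold: unwrap one level, fold the recursive positions, apply the algebra.
cata :: (Regular t, Functor (PF t)) => (PF t r -> r) -> t -> r
cata alg = alg . fmap (cata alg) . unwrap

-- Summing a list, phrased as an algebra over its pattern functor.
sumAlg :: PF [Int] Int -> Int
sumAlg (InL (Const ()))         = 0
sumAlg (InR (Const a :*: Id r)) = a + r
```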

<p>Regular types allow for even more succinct <code>HasTrie</code> instance implementations.
Specialize <code>HasTrieIsomorph</code> further:</p>

<pre><code>#define HasTrieRegular(Context,Type) \
HasTrieIsomorph(Context, Type, PF (Type) (Type), unwrap, wrap)
</code></pre>

<p>For instance, for lists and rose trees:</p>

<pre><code>HasTrieRegular(HasTrie a, [a])
HasTrieRegular(HasTrie a, Tree a)
</code></pre>

<p>The <code>HasTrieRegular</code> macro could be specialized even further for single-parameter polymorphic data types:</p>

<pre><code>#define HasTrieRegular1(TypeCon) HasTrieRegular(HasTrie a, TypeCon a)

HasTrieRegular1([])
HasTrieRegular1(Tree)
</code></pre>

<p>You might wonder if I&#8217;m cheating here, by claiming very simple trie specifications when I&#8217;m really just shuffling code around.
After all, the complexity removed from <code>HasTrie</code> instances shows up in <code>Regular</code> instances.
The win in making this shuffle is that <code>Regular</code> is handy for other purposes, as illustrated in <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.2412" title="Paper by Alexey Rodriguez, Stefan Holdermans, Andres Löh, and Johan Jeuring">Generic programming with fixed points for mutually recursive datatypes</a></em> (including <code>fold</code>, <code>unfold</code>, and <code>fmap</code>).
(More examples in <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.4778" title="Paper by Thomas Noort, Alexey Rodriguez, Stefan Holdermans, Johan Jeuring, Bastiaan Heeren">A Lightweight Approach to Datatype-Generic Rewriting</a></em>.)</p>

<h3>Trouble</h3>

<p>Sadly, these elegant trie definitions have a problem.
Trying to compile them leads to an error message from GHC.
For instance,</p>

<pre><code>Nested type family application
  in the type family application: Trie (PF [a] [a])
(Use -XUndecidableInstances to permit this)
</code></pre>

<p>Adding <code>UndecidableInstances</code> silences this error message, but leads to nontermination in the compiler.</p>

<p>Expanding definitions, I can see the likely cause of nontermination.
The definition in terms of a type family allows an infinite type to sneak through, and I guess GHC&#8217;s type checker is unfolding infinitely.</p>

<p>As a simpler example:</p>

<pre><code>{-# LANGUAGE TypeFamilies, UndecidableInstances #-}

type family List a :: *

type instance List a = Either () (a, List a)

-- Hangs ghc 6.12.1:
nil :: List a
nil = Left ()
</code></pre>
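<p>By contrast, routing the recursion through an explicit constructor, as done for tries in the next section, compiles without trouble. A minimal sketch of the same <code>List</code> example:</p>

```haskell
-- The newtype constructor breaks the infinite type-level unfolding.
newtype List a = L (Either () (a, List a))

nil :: List a
nil = L (Left ())

cons :: a -> List a -> List a
cons a as = L (Right (a, as))

-- Observe the contents by converting back to an ordinary list.
toList :: List a -> [a]
toList (L (Left ()))       = []
toList (L (Right (a, as))) = a : toList as
```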

<h3>A solution</h3>

<p>Since GHC&#8217;s type-checker cannot handle directly recursive types, perhaps we can use a standard avoidance strategy, namely introducing a <code>newtype</code> or <code>data</code> definition to break the cycle.
For instance, as a trie for <code>[a]</code>, we got into trouble by using the trie of the unwrapped form of <code>[a]</code>, i.e., <code>Trie (PF [a] [a])</code>.
So instead,</p>

<pre><code>newtype ListTrie a v = ListTrie (Trie (PF [a] [a]) v)
</code></pre>

<p>which is to say</p>

<pre><code>newtype ListTrie a v = ListTrie (PF [a] [a] :→: v)
</code></pre>

<p>Now <code>wrap</code> and <code>unwrap</code> as before, and add &amp; remove <code>ListTrie</code> as needed:</p>

<pre><code>instance HasTrie a ⇒ HasTrie [a] where
  type Trie [a] = ListTrie a
  trie f = ListTrie (trie (f . wrap))
  untrie (ListTrie t) = untrie t . unwrap
</code></pre>

<p>Again, abstract the boilerplate code into a C macro:</p>

<pre><code>#define HasTrieRegular(Context,Type,TrieType,TrieCon) \
newtype TrieType v = TrieCon (PF (Type) (Type) :→: v); \
instance Context ⇒ HasTrie (Type) where { \
  type Trie (Type) = TrieType; \
  trie f = TrieCon (trie (f . wrap)); \
  untrie (TrieCon t) = untrie t . unwrap; \
}
</code></pre>

<p>For instance,</p>

<pre><code>HasTrieRegular(HasTrie a, [a], ListTrie a, ListTrie)
HasTrieRegular(HasTrie a, Tree a, TreeTrie a, TreeTrie)
</code></pre>

<p>Again, simplify a bit with a specialization to unary regular types:</p>

<pre><code>#define HasTrieRegular1(TypeCon,TrieCon) \
HasTrieRegular(HasTrie a, TypeCon a, TrieCon a, TrieCon)
</code></pre>

<p>And then use the following declarations instead:</p>

<pre><code>HasTrieRegular1([]  , ListTrie)
HasTrieRegular1(Tree, TreeTrie)
</code></pre>

<p>Similarly for binary (and higher-arity) regular types, as needed.</p>

<p>The second macro parameter (<code>TrieCon</code>) is just a name, which I don&#8217;t intend to be used other than in the macro-generated code.
It could be eliminated, if there were a way to gensym the name.
Perhaps with Template Haskell?</p>

<h3>Conclusion</h3>

<p>I like the elegance of constructing memo tries in terms of common functor combinators.
Standard pattern functors allow for extremely succinct trie specifications for regular data types.
However, these specifications lead to nontermination of the type checker, which can then be avoided by the standard trick of introducing a newtype to break type recursion.
As is often the case, this trick introduces some clumsiness.
Perhaps the problem can also be avoided by a formulation using <em>bifunctors</em>, as in <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.60.5251" title="Paper by Jeremy Gibbons">Design Patterns as Higher-Order Datatype-Generic Programs</a></em> and <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.6390" title="Paper by Ulf Norell and Patrik Jansson">Polytypic programming in Haskell</a></em>, which allows the fixed-point nature of regular data types to be exposed.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Felegant-memoization-with-higher-order-types&amp;language=en_GB&amp;category=text&amp;title=Elegant+memoization+with+higher-order+types&amp;description=A+while+back%2C+I+got+interested+in+functional+memoization%2C+especially+after+seeing+some+code+from+Spencer+Janssen+using+the+essential+idea+of+Ralf+Hinze%26%238217%3Bs+paper+Generalizing+Generalized+Tries.+The+blog...&amp;tags=functor%2Cisomorphism%2Cmemoization%2Ctrie%2Cblog" type="text/html" />
	</item>
	</channel>
</rss>
