Fixing broken isomorphisms — details for non-strict memoization, part 2

The post Details for non-strict memoization, part 1 works out a systematic way of doing non-strict memoization, i.e., correct memoization of non-strict (and more broadly, non-hyper-strict) functions. As I mentioned at the end, there was an awkward aspect, which is that the purported “isomorphisms” used for regular types are not quite isomorphisms.

For instance, functions from triples are memoized by converting to and from nested pairs:

untriple ∷ (a,b,c) -> ((a,b),c)
untriple (a,b,c) = ((a,b),c)
 
triple ∷ ((a,b),c) -> (a,b,c)
triple ((a,b),c) = (a,b,c)

Then untriple and triple form an embedding/projection pair, i.e.,

triple ∘ untriple ≡ id
untriple ∘ triple ⊑ id

The reason for the inequality is that the nested-pair form permits (⊥,c), which does not correspond to any triple.

untriple (triple (,c)) ≡ untriple ⊥ ≡ ⊥

Can we patch this problem by simply using an irrefutable (lazy) pattern in the definition of triple, i.e., triple (~(a,b),c) = (a,b,c)? Let’s try:

untriple (triple (,c)) ≡ untriple (,,c)((,),c)

So isomorphism fails and so does even the embedding/projection property.

Similarly, to deal with regular algebraic data types, I used a class that describes regular data types as repeated applications of a single, associated pattern functor (following A Lightweight Approach to Datatype-Generic Rewriting):

class Functor (PF t) ⇒ Regular t where
  type PF t ∷ **
  unwrap ∷ t → PF t t
  wrap   ∷ PF t t → t

Here unwrap converts a value into its pattern functor form, and wrap converts back. For example, here is the Regular instance I had used for lists:

instance Regular [a] where
  type PF [a] = Const () :+: Const a :*: Id
 
  unwrap []     = InL (Const ())
  unwrap (a:as) = InR (Const a :*: Id as)
 
  wrap (InL (Const ()))          = []
  wrap (InR (Const a :*: Id as)) = a:as

Again, we have an embedding/projection pair, rather than a genuine isomorphism:

wrap ∘ unwrap ≡ id
unwrap ∘ wrap ⊑ id

The inequality comes from ⊥ values occurring in PF [a] [a] at type Const () [a], (), (Const a :*: Id) [a], Const a [a], or Id [a].

Why care?

What harm results from the lack of genuine isomorphism? For hyper-strict functions, as usually handled (correctly) in memoization, I don’t think there is any harm. For correct memoization of non-hyper-strict functions, however, the superfluous points of undefinedness lead to larger memo tries and wasted effort. For instance, a function from triples goes through some massaging on the way to being memoized:

λ (a,b,c) → ⋯
⇓
λ ((a,b),c) → ⋯
⇓
λ (a,b) → λ c → ⋯

For hyper-strict memoization, the next step transforms to λ a → λ b → λ c → ⋯. For non-strict memoization, however, we first stash away the value of the function applied to ⊥ ∷ (a,b), which will always be ⊥ in this context.

Strict products and sums

To eliminate the definedness discrepancy and regain isomorphism, we might make all non-strictness explicit via unlifted product & sums, and explicit lifting.

-- | Add a bottom to a type
data Lift a = Lift { unLift ∷ a } deriving Functor
 
infixl 6 :+:!
infixl 7 :*:!
 
-- | Strict pair
data a :*! b = !a :*! !b
 
-- | Strict sum
data a :+! b = Left' !a | Right' !b

Note that the Id and Const a functors used in canonical representations are already strict, as they’re defined via newtype.

With these new tools, we can decompose isomorphically. For instance,

(a,b,c) ≅ Lift a :*! Lift b :*! Lift c

with the isomorphism given by

untriple' ∷ (a,b,c) -> Lift a :*! Lift b :*! Lift c
untriple' (a,b,c) = Lift a :*! Lift b :*! Lift c
 
triple' ∷ Lift a :*! Lift b :*! Lift c -> (a,b,c)
triple' (Lift a :*! Lift b :*! Lift c) = (a,b,c)

For regular types, we’ll also want variations as functor combinators:

-- | Strict product functor
data (f :*:! g) a = !(f a) :*:! !(g a) deriving Functor
 
-- | Strict sum functor
data (f :+:! g) a = InL' !(f a) | InR' !(g a) deriving Functor

Then change the Regular instance on lists to the following:

instance Regular [a] where
  type PF [a] = Const () :+:! Const (Lift a) :*:! Lift
 
  unwrap []     = InL' (Const ())
  unwrap (a:as) = InR' (Const (Lift a) :*:! Lift as)
 
  wrap (InL' (Const ()))                    = []
  wrap (InR' (Const (Lift a) :*:! Lift as)) = a:as

I suppose it would be fairly straightforward to derive such instances for algebraic data types automatically via Template Haskell.

Tries for non-strict memoization

As in part 1, represent a non-strict memo trie for a function f ∷ k -> v as a value for f ⊥ and a strict (but not hyper-strict) memo trie for f:

type k :→: v = Trie v (k :→ v)

For non-strict sum domains, the strict memo trie was a pair of non-strict tries:

instance (HasTrie a, HasTrie b) ⇒ HasTrie (Either a b) where
  type STrie (Either a b) = Trie a :*: Trie b
  sTrie   f           = trie (f ∘ Left) :*: trie (f ∘ Right)
  sUntrie (ta :*: tb) = untrie ta `either` untrie tb

For non-strict product, the strict trie was a composition of non-strict tries:

instance (HasTrie a, HasTrie b) => HasTrie (a , b) where
  type STrie (a , b) = Trie a :. Trie b
  sTrie   f = O (fmap trie (trie (curry f)))
  sUntrie (O tt) = uncurry (untrie (fmap untrie tt))

What about strict sum and product domains? Since strict sums & products cannot contain ⊥ as their immediate components, we can omit the values corresponding to ⊥ for those components. That is, we can use pairs and compositions of strict tries instead.

instance (HasTrie a, HasTrie b) => HasTrie (a :+! b) where
  type STrie (a :+! b) = STrie a :*: STrie b
  sTrie   f           = sTrie (f . Left') :*: sTrie (f . Right')
  sUntrie (ta :*: tb) = sUntrie ta `either'` sUntrie tb
 
instance (HasTrie a, HasTrie b) => HasTrie (a :*! b) where
  type STrie (a :*! b) = STrie a :. STrie b
  sTrie   f      = O (fmap sTrie (sTrie (curry' f)))
  sUntrie (O tt) = uncurry' (sUntrie (fmap sUntrie tt))

I’ve also substituted versions of curry and uncurry for strict products and either for strict sums:

curry' ∷ (a :*! b -> c) -> (a -> b -> c)
curry' f a b = f (a :*! b)
 
uncurry' ∷ (a -> b -> c) -> ((a :*! b) -> c)
uncurry' f (a :*! b) = f a b
 
either' ∷ (a -> c) -> (b -> c) -> (a :+! b -> c)
either' f _ (Left'  a) = f a
either' _ g (Right' b) = g b

We’ll also need to handle the lifting functor. The type Lift a has an additional bottom. A strict function or trie over Lift a is only strict in the lower (outer) one. So a strict trie over Lift a is simply a non-strict trie over a.

instance HasTrie a => HasTrie (Lift a) where
  type STrie (Lift a) = Trie a
  sTrie   f = trie (f . Lift)
  sUntrie t = untrie t . unLift

Notice that this instance puts back exactly what was lost from memo tries when going from non-strict products and sums to strict products and sums. The reason for this relationship is explained in the following simple isomorphisms:

(a,b)      ≅ Lift a :*! Lift b
Either a b ≅ Lift a :+! Lift b

Then isomorphisms can then be used to implement memoize over non-strict products and sums via memoization over strict products and sums.

Higher-order memoization

The post Memoizing higher-order functions suggested a simple way to memoize functions over function-valued domains by using (as always) type isomorphisms. The isomorphism used is between functions and memo tries.

I gave one example in that post

ft1 ∷ (Bool → a)[a]
ft1 f = [f False, f True]

In retrospect, this example was a lousy choice, as it hides an important problem. The Bool type is finite, and so the corresponding trie type has only finitely large elements. For that reason, higher-order memoization can get away with the usual hyper-strict memoization.

If instead, we try memoizing a function of type (a → b) → c, where the type a has infinitely many elements (e.g., Integer or [Bool]), then we’ll have to memoize over the domain a :→: b (memo tries from a to b), which includes infinite elements. In that case, hyper-strict memoization blows up, so we’ll want to use non-strict memoization instead.

As mentioned above, the type of non-strict tries contains a value and a strict trie:

type k :→: v = Trie v (k :→ v)

I thought I’d memoize by mapping to & from the isomorphic pair type (v, k :→ v). However, now I’m not satisfied with this mapping. A non-strict trie from k to v is not just any such pair of v and k :→ v. Monotonicity requires that the single v value (for ⊥) be a lower bound (information-wise) of every v in the trie. Ignoring this constraint would lead to a trie in which most of the entries do not correspond to any non-strict memo trie.

Puzzle: Can this constraint be captured as a static type in modern Haskell’s (GHC’s) type system (i.e., without resorting to general dependent typing)? I don’t know the answer.

Memoizing abstract types

This problem is more wide-spread still. Whenever there are constraints on a representation beyond what is expressed directly and statically in the representation type, we will have this same sort of isomorphism puzzle. Can we capture the constraint as a Haskell type? When we cannot, what do we do?

If we didn’t care about efficiency, I think we could ignore the issue, and everything else in this blog post, and accept making memo tries that are much larger than necessary. Although laziness will keep from filling in range values for unaccessed domain values, I worry that there will be quite a lot time and space wasted navigating past large portions of unusable trie structure.

Leave a comment