## A trie for length-typed vectors

As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up.

Memoization is the conversion of functions to data structures. A simple, elegant, and purely functional form of memoization comes from applying three common type isomorphisms, which also correspond to three laws of exponents, familiar from high school math, as noted by Ralf Hinze in his paper Generalizing Generalized Tries.

In Haskell, one can neatly formulate memo tries via an associated functor, `Trie`, with a convenient synonym "`k ↛ v`" for `Trie k v`, as in Elegant memoization with higher-order types. (Note that I’ve changed my pretty-printing from "`k :→: v`" to "`k ↛ v`".) The key property is that the data structure encodes (is isomorphic to) a function, i.e.,

``k ↛ a ≅ k → a``

In most cases, we ignore non-strictness, though there is a delightful solution for memoizing non-strict functions correctly.

My previous four posts explored use of types to statically bound numbers and to determine lengths of vectors.

Just as (infinite-only) streams are the natural trie for unary natural numbers, we saw in Reverse-engineering length-typed vectors that length-typed vectors (one-dimensional arrays) are the natural trie for statically bounded natural numbers.

``BNat n ↛ a ≡ Vec n a``

and so

``BNat n → a ≅ Vec n a``

In retrospect, this relationship is completely unsurprising, since a vector of length n is a collection of values, indexed by 0, . . . , n - 1.

In that same post, I noted that vectors are not only a trie for bounded numbers, but when the elements are also bounded numbers, the vectors can also be thought of as numbers. Both the number of digits and the number base are captured statically, in types:

``type Digits n b = Vec n (BNat b)``

The type parameters `n` and `b` here are type-encodigs of unary numbers, i.e., built up from zero and successor (`Z` and `S`). For instance, when `b ≡ S (S Z)`, we have n-bit binary numbers.

In this new post, I look at another question of tries and vectors. Given that `Vec n` is the trie for `BNat n`, is there also a trie for `Vec n`?

Edits:

• 2011-01-31: Switched trie notation to "`k ↛ v`" to avoid missing character on iPad.

### A trie for length-typed vectors

A vector is a trie over bounded numbers. What is a trie over vectors? As always, isomorphisms show us the way.

``Vec n a → b ≅ (a × ⋯ × a) → b             ≅ a → ⋯ → a → b             ≅ a ↛ ⋯ ↛ a ↛ b             ≡ Trie a (⋯ (Trie a b)⋯)             ≅ (Trie a ∘ ⋯ ∘ Trie a) b``

So the trie (functor) for `Vec n a` is the n-ary composition of tries for `a`:

``type VTrie n a = Trie a :^ n ``

where for any functor `f` and natural number type `n`,

``f :^ n ≅ f ∘ ⋯ ∘ f  -- (n times)``

### N-ary functor composition

Since composition is associative, a recursive formulation might naturally fold from the left or from right. (Or perhaps in a balanced tree, to facilitate parallel execution.)

#### Right-folded composition

Let's look at each fold direction, starting with the right, i.e.,

``f :^ Z   ≅ Idf :^ S n ≅ f ∘ (f :^ n)``

``data (:^) ∷ (* → *) → * → (* → *) where  ZeroC ∷                        a  → (f :^ Z) a  SuccC ∷ IsNat n ⇒ f ((f :^ n) a) → (f :^ (S n)) a``

Functors compose into functors and applicatives into applicatives. (See Applicative Programming with Effects (section 5) and the instance definitions in Semantic editor combinators.) The following definitions arise from the standard instances for binary functor composition.

``instance Functor f ⇒ Functor (f :^ n) where  fmap h (ZeroC a)  = ZeroC (h a)  fmap h (SuccC fs) = SuccC ((fmap∘fmap) h fs)``
``instance (IsNat n, Applicative f) ⇒ Applicative (f :^ n) where  pure = pureN nat  ZeroC f  ⊛ ZeroC x  = ZeroC (f x)  SuccC fs ⊛ SuccC xs = SuccC (liftA2 (⊛) fs xs)``
``pureN ∷ Applicative f ⇒ Nat n → a → (f :^ n) apureN Zero     a = ZeroC apureN (Succ _) a = SuccC ((pure ∘ pure) a)``

More explicitly, the second `pure` could instead use `pureN`:

``pureN (Succ n) a = SuccC ((pure ∘ pureN n) a)``
##### Some tidier definitions

Using `(⊔)`, there are tidier definitions of `fmap` and `(⊛)`:

``  fmap = inZeroC  (\$) ⊔ inSuccC  (fmap∘fmap)  (⊛)  = inZeroC2 (\$) ⊔ inSuccC2 (liftA2 (⊛))``

where the new combinators are partial functions that work inside of `ZeroC` and `SuccC`.

``inZeroC  h (ZeroC a ) = ZeroC (h a )inSuccC  h (SuccC as) = SuccC (h as)``
``inZeroC2 h (ZeroC a ) (ZeroC b ) = ZeroC (h a  b )inSuccC2 h (SuccC as) (SuccC bs) = SuccC (h as bs)``

This example demonstrates another notational benefit of `(⊔)`, extending the techniques in the post Lazier function definitions by merging partial values.

#### Left-folded composition

For left-folded composition, a tiny change suffices in the `S` case:

``f :^ Z   ≅ Idf :^ S n ≅ (f :^ n) ∘ f``

which translates to a correspondingly tiny change in the `SuccC` constructor.

``data (:^) ∷ (* → *) → * → (* → *) where  ZeroC ∷                        a  → (f :^ Z) a  SuccC ∷ IsNat n ⇒ (f :^ n) (f a) → (f :^ (S n)) a``

The `Functor` and `Applicative` instances are completely unchanged.

### Vector tries (continued)

Using the analysis above, we can easily define tries over vectors as n-ary composition of tries over the vector element type. Again, there is a right-folded and a left-folded version.

#### Right-folded vector tries

``instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where  type Trie (Vec n a) = Trie a :^ n``

Conversion from trie to function is, as always, a trie look-up. Its definition closely follows the definition of `f :^ n`:

``  ZeroC v `untrie` ZVec      = v  SuccC t `untrie` (a :< as) = (t `untrie` a) `untrie` as``

For `untrie`, we were able to follow the zero/successor structure of the trie. For `trie`, we don't have such a structure to follow, but we can play the same trick as for defining `units` above. Use the `nat` method of the `IsNat` class to synthesize a number of type `Nat n`, and then follow the structure of that number in a new recursive function definition.

``  trie = trieN nat``

where

``trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) btrieN Zero     f = ZeroC (f ZVec)trieN (Succ _) f = SuccC (trie (λ a → trie (f ∘ (a :<))))``

#### Left-folded vector tries

The change from right-folding to left-folding is minuscule.

``instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where  type Trie (Vec n a) = Trie a :^ n  ZeroC b `untrie` ZVec      = b  SuccC t `untrie` (a :< as) = (t `untrie` as) `untrie` a    trie = trieN nat``
``trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) btrieN Zero     f = ZeroC (f ZVec)trieN (Succ _) f = SuccC (trie (λ as → trie (f ∘ (:< as))))``

#### Right vs Left?

There are two tries for `Vec n a`, but with Haskell we have to choose one as the `HasTrie` instance. How can we make our choice?

The same sort of question arises earlier. The post Reverse-engineering length-typed vectors showed how to discover `Vec n` by looking for the trie functor for the `BNat n`. The derivation was

``Vec n a ≅ a × ⋯ × a        ≅ (1 → a) × ⋯ × (1 → a)        ≅ (1 + ⋯ + 1) → a``

The question of associativity arises here as well, for both product and sum. The `BNat n a` and `Vec n` types both choose right-associativity:

``data BNat ∷ * → * where  BZero ∷          BNat (S n)  BSucc ∷ BNat n → BNat (S n)data Vec ∷ * → * → * where  ZVec ∷                Vec Z     a  (:<) ∷ a → Vec n a → Vec (S n) a``

Since trie construction is type-driven, I see the vector trie based on right-folded composition as in line with these definitions. Left-folding would come from a small initial change, swapping the constructors in the `BNat` definition:

``data BNat ∷ * → * where  BSucc ∷ BNat n → BNat (S n)  BZero ∷          BNat (S n)``

From this tweaked beginning, a `BNat` trie construction would lead to a left-associated `Vec` type:

``data Vec ∷ * → * → * where  (:<) ∷ Vec n a → a → Vec (S n) a  ZVec ∷                Vec Z     a``

And then left-folded composition for the `Vec` trie.