A trie for length-typed vectors

As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up.

Memoization is the conversion of functions to data structures. A simple, elegant, and purely functional form of memoization comes from applying three common type isomorphisms, which also correspond to three laws of exponents, familiar from high school math, as noted by Ralf Hinze in his paper Generalizing Generalized Tries.

In Haskell, one can neatly formulate memo tries via an associated functor, Trie, with a convenient synonym "k ↛ v" for Trie k v, as in Elegant memoization with higher-order types. (Note that I’ve changed my pretty-printing from "k :→: v" to "k ↛ v".) The key property is that the data structure encodes (is isomorphic to) a function, i.e.,

k ↛ a  k → a

In most cases, we ignore non-strictness, though there is a delightful solution for memoizing non-strict functions correctly.

My previous four posts explored use of types to statically bound numbers and to determine lengths of vectors.

Just as (infinite-only) streams are the natural trie for unary natural numbers, we saw in Reverse-engineering length-typed vectors that length-typed vectors (one-dimensional arrays) are the natural trie for statically bounded natural numbers.

BNat n ↛ a ≡ Vec n a

and so

BNat n → a  Vec n a

In retrospect, this relationship is completely unsurprising, since a vector of length n is a collection of values, indexed by 0, . . . , n - 1.

In that same post, I noted that vectors are not only a trie for bounded numbers, but when the elements are also bounded numbers, the vectors can also be thought of as numbers. Both the number of digits and the number base are captured statically, in types:

type Digits n b = Vec n (BNat b)

The type parameters n and b here are type-encodigs of unary numbers, i.e., built up from zero and successor (Z and S). For instance, when b ≡ S (S Z), we have n-bit binary numbers.

In this new post, I look at another question of tries and vectors. Given that Vec n is the trie for BNat n, is there also a trie for Vec n?

Edits:

  • 2011-01-31: Switched trie notation to "k ↛ v" to avoid missing character on iPad.

A trie for length-typed vectors

A vector is a trie over bounded numbers. What is a trie over vectors? As always, isomorphisms show us the way.

Vec n a → b  (a × ⋯ × a) → b
a → ⋯ → a → b
a ↛ ⋯ ↛ a ↛ b
Trie a (⋯ (Trie a b)⋯)
(Trie a ∘ ⋯ ∘ Trie a) b

So the trie (functor) for Vec n a is the n-ary composition of tries for a:

type VTrie n a = Trie a :^ n 

where for any functor f and natural number type n,

f :^ n  f ∘ ⋯ ∘ f  -- (n times)

N-ary functor composition

Since composition is associative, a recursive formulation might naturally fold from the left or from right. (Or perhaps in a balanced tree, to facilitate parallel execution.)

Right-folded composition

Let's look at each fold direction, starting with the right, i.e.,

f :^ Z    Id
f :^ S n f ∘ (f :^ n)

Writing as a GADT:

data (:^)  (**) → * → (**) where
ZeroC a → (f :^ Z) a
SuccC IsNat n ⇒ f ((f :^ n) a) → (f :^ (S n)) a

Functors compose into functors and applicatives into applicatives. (See Applicative Programming with Effects (section 5) and the instance definitions in Semantic editor combinators.) The following definitions arise from the standard instances for binary functor composition.

instance Functor f ⇒ Functor (f :^ n) where
fmap h (ZeroC a) = ZeroC (h a)
fmap h (SuccC fs) = SuccC ((fmap∘fmap) h fs)
instance (IsNat n, Applicative f) ⇒ Applicative (f :^ n) where
pure = pureN nat
ZeroC f ⊛ ZeroC x = ZeroC (f x)
SuccC fs ⊛ SuccC xs = SuccC (liftA2 (⊛) fs xs)
pureN  Applicative f ⇒ Nat n → a → (f :^ n) a
pureN Zero a = ZeroC a
pureN (Succ _) a = SuccC ((pure ∘ pure) a)

More explicitly, the second pure could instead use pureN:

pureN (Succ n) a = SuccC ((pure ∘ pureN n) a)
Some tidier definitions

Using (⊔), there are tidier definitions of fmap and (⊛):

  fmap = inZeroC  ($) ⊔ inSuccC  (fmap∘fmap)

(⊛) = inZeroC2 ($) ⊔ inSuccC2 (liftA2 (⊛))

where the new combinators are partial functions that work inside of ZeroC and SuccC.

inZeroC  h (ZeroC a ) = ZeroC (h a )

inSuccC h (SuccC as) = SuccC (h as)
inZeroC2 h (ZeroC a ) (ZeroC b ) = ZeroC (h a  b )

inSuccC2 h (SuccC as) (SuccC bs) = SuccC (h as bs)

This example demonstrates another notational benefit of (⊔), extending the techniques in the post Lazier function definitions by merging partial values.

Left-folded composition

For left-folded composition, a tiny change suffices in the S case:

f :^ Z    Id
f :^ S n (f :^ n) ∘ f

which translates to a correspondingly tiny change in the SuccC constructor.

data (:^)  (**) → * → (**) where
ZeroC a → (f :^ Z) a
SuccC IsNat n ⇒ (f :^ n) (f a) → (f :^ (S n)) a

The Functor and Applicative instances are completely unchanged.

Vector tries (continued)

Using the analysis above, we can easily define tries over vectors as n-ary composition of tries over the vector element type. Again, there is a right-folded and a left-folded version.

Right-folded vector tries

instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
type Trie (Vec n a) = Trie a :^ n

Conversion from trie to function is, as always, a trie look-up. Its definition closely follows the definition of f :^ n:

  ZeroC v `untrie` ZVec      = v
SuccC t `untrie` (a :< as) = (t `untrie` a) `untrie` as

For untrie, we were able to follow the zero/successor structure of the trie. For trie, we don't have such a structure to follow, but we can play the same trick as for defining units above. Use the nat method of the IsNat class to synthesize a number of type Nat n, and then follow the structure of that number in a new recursive function definition.

  trie = trieN nat

where

trieN  HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ a → trie (f ∘ (a :<))))

Left-folded vector tries

The change from right-folding to left-folding is minuscule.

instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
type Trie (Vec n a) = Trie a :^ n

ZeroC b `untrie` ZVec = b
SuccC t `untrie` (a :< as) = (t `untrie` as) `untrie` a

trie = trieN nat
trieN  HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ as → trie (f ∘ (:< as))))

Right vs Left?

There are two tries for Vec n a, but with Haskell we have to choose one as the HasTrie instance. How can we make our choice?

The same sort of question arises earlier. The post Reverse-engineering length-typed vectors showed how to discover Vec n by looking for the trie functor for the BNat n. The derivation was

Vec n a  a × ⋯ × a
(1 → a) × ⋯ × (1 → a)
(1 ++ 1) → a

The question of associativity arises here as well, for both product and sum. The BNat n a and Vec n types both choose right-associativity:

data BNat  ** where
BZero BNat (S n)
BSucc BNat n → BNat (S n)

data Vec *** where
ZVec Vec Z a
(:<) a → Vec n a → Vec (S n) a

Since trie construction is type-driven, I see the vector trie based on right-folded composition as in line with these definitions. Left-folding would come from a small initial change, swapping the constructors in the BNat definition:

data BNat  ** where
BSucc BNat n → BNat (S n)
BZero BNat (S n)

From this tweaked beginning, a BNat trie construction would lead to a left-associated Vec type:

data Vec  *** where
(:<) Vec n a → a → Vec (S n) a
ZVec Vec Z a

And then left-folded composition for the Vec trie.

3 Comments

  1. Conal Elliott » Blog Archive » Deriving parallel tree scans:

    […] to extend the derivation to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The functions initTs and tailTs make unbalanced trees out of balanced […]

  2. Conal Elliott » Blog Archive » Composable parallel scanning:

    […] of inits and tails on lists) to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The difficulty I encounter is that the functions initTs and tailTs make […]

  3. Conal Elliott » Blog Archive » A third view on trees:

    […] Alternatively, make the tree sizes explicit in the types, as in a few recent posts, including A trie for length-typed vectors. (In those posts, I used the terms "right-folded" and "left-folded" in place of […]

Leave a comment