A trie for length-typed vectors
As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up.
Memoization is the conversion of functions to data structures. A simple, elegant, and purely functional form of memoization comes from applying three common type isomorphisms, which also correspond to three laws of exponents, familiar from high school math, as noted by Ralf Hinze in his paper Generalizing Generalized Tries.
In Haskell, one can neatly formulate memo tries via an associated functor, Trie
, with a convenient synonym "k ↛ v
" for Trie k v
, as in Elegant memoization with higher-order types. (Note that I’ve changed my pretty-printing from "k :→: v
" to "k ↛ v
".) The key property is that the data structure encodes (is isomorphic to) a function, i.e.,
k ↛ a ≅ k → a
In most cases, we ignore non-strictness, though there is a delightful solution for memoizing non-strict functions correctly.
My previous four posts explored use of types to statically bound numbers and to determine lengths of vectors.
Just as (infinite-only) streams are the natural trie for unary natural numbers, we saw in Reverse-engineering length-typed vectors that length-typed vectors (one-dimensional arrays) are the natural trie for statically bounded natural numbers.
BNat n ↛ a ≡ Vec n a
and so
BNat n → a ≅ Vec n a
In retrospect, this relationship is completely unsurprising, since a vector of length n is a collection of values, indexed by 0, . . . , n - 1.
In that same post, I noted that vectors are not only a trie for bounded numbers, but when the elements are also bounded numbers, the vectors can also be thought of as numbers. Both the number of digits and the number base are captured statically, in types:
type Digits n b = Vec n (BNat b)
The type parameters n
and b
here are type-encodigs of unary numbers, i.e., built up from zero and successor (Z
and S
). For instance, when b ≡ S (S Z)
, we have n-bit binary numbers.
In this new post, I look at another question of tries and vectors. Given that Vec n
is the trie for BNat n
, is there also a trie for Vec n
?
Edits:
- 2011-01-31: Switched trie notation to "
k ↛ v
" to avoid missing character on iPad.
A trie for length-typed vectors
A vector is a trie over bounded numbers. What is a trie over vectors? As always, isomorphisms show us the way.
Vec n a → b ≅ (a × ⋯ × a) → b
≅ a → ⋯ → a → b
≅ a ↛ ⋯ ↛ a ↛ b
≡ Trie a (⋯ (Trie a b)⋯)
≅ (Trie a ∘ ⋯ ∘ Trie a) b
So the trie (functor) for Vec n a
is the n-ary composition of tries for a
:
type VTrie n a = Trie a :^ n
where for any functor f
and natural number type n
,
f :^ n ≅ f ∘ ⋯ ∘ f -- (n times)
N-ary functor composition
Since composition is associative, a recursive formulation might naturally fold from the left or from right. (Or perhaps in a balanced tree, to facilitate parallel execution.)
Right-folded composition
Let's look at each fold direction, starting with the right, i.e.,
f :^ Z ≅ Id
f :^ S n ≅ f ∘ (f :^ n)
Writing as a GADT:
data (:^) ∷ (* → *) → * → (* → *) where
ZeroC ∷ a → (f :^ Z) a
SuccC ∷ IsNat n ⇒ f ((f :^ n) a) → (f :^ (S n)) a
Functors compose into functors and applicatives into applicatives. (See Applicative Programming with Effects (section 5) and the instance definitions in Semantic editor combinators.) The following definitions arise from the standard instances for binary functor composition.
instance Functor f ⇒ Functor (f :^ n) where
fmap h (ZeroC a) = ZeroC (h a)
fmap h (SuccC fs) = SuccC ((fmap∘fmap) h fs)
instance (IsNat n, Applicative f) ⇒ Applicative (f :^ n) where
pure = pureN nat
ZeroC f ⊛ ZeroC x = ZeroC (f x)
SuccC fs ⊛ SuccC xs = SuccC (liftA2 (⊛) fs xs)
pureN ∷ Applicative f ⇒ Nat n → a → (f :^ n) a
pureN Zero a = ZeroC a
pureN (Succ _) a = SuccC ((pure ∘ pure) a)
More explicitly, the second pure
could instead use pureN
:
pureN (Succ n) a = SuccC ((pure ∘ pureN n) a)
Some tidier definitions
Using (⊔)
, there are tidier definitions of fmap
and (⊛)
:
fmap = inZeroC ($) ⊔ inSuccC (fmap∘fmap)
(⊛) = inZeroC2 ($) ⊔ inSuccC2 (liftA2 (⊛))
where the new combinators are partial functions that work inside of ZeroC
and SuccC
.
inZeroC h (ZeroC a ) = ZeroC (h a )
inSuccC h (SuccC as) = SuccC (h as)
inZeroC2 h (ZeroC a ) (ZeroC b ) = ZeroC (h a b )
inSuccC2 h (SuccC as) (SuccC bs) = SuccC (h as bs)
This example demonstrates another notational benefit of (⊔)
, extending the techniques in the post Lazier function definitions by merging partial values.
Left-folded composition
For left-folded composition, a tiny change suffices in the S
case:
f :^ Z ≅ Id
f :^ S n ≅ (f :^ n) ∘ f
which translates to a correspondingly tiny change in the SuccC
constructor.
data (:^) ∷ (* → *) → * → (* → *) where
ZeroC ∷ a → (f :^ Z) a
SuccC ∷ IsNat n ⇒ (f :^ n) (f a) → (f :^ (S n)) a
The Functor
and Applicative
instances are completely unchanged.
Vector tries (continued)
Using the analysis above, we can easily define tries over vectors as n-ary composition of tries over the vector element type. Again, there is a right-folded and a left-folded version.
Right-folded vector tries
instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
type Trie (Vec n a) = Trie a :^ n
Conversion from trie to function is, as always, a trie look-up. Its definition closely follows the definition of f :^ n
:
ZeroC v `untrie` ZVec = v
SuccC t `untrie` (a :< as) = (t `untrie` a) `untrie` as
For untrie
, we were able to follow the zero/successor structure of the trie. For trie
, we don't have such a structure to follow, but we can play the same trick as for defining units
above. Use the nat
method of the IsNat
class to synthesize a number of type Nat n
, and then follow the structure of that number in a new recursive function definition.
trie = trieN nat
where
trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ a → trie (f ∘ (a :<))))
Left-folded vector tries
The change from right-folding to left-folding is minuscule.
instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
type Trie (Vec n a) = Trie a :^ n
ZeroC b `untrie` ZVec = b
SuccC t `untrie` (a :< as) = (t `untrie` as) `untrie` a
trie = trieN nat
trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ as → trie (f ∘ (:< as))))
Right vs Left?
There are two tries for Vec n a
, but with Haskell we have to choose one as the HasTrie
instance. How can we make our choice?
The same sort of question arises earlier. The post Reverse-engineering length-typed vectors showed how to discover Vec n
by looking for the trie functor for the BNat n
. The derivation was
Vec n a ≅ a × ⋯ × a
≅ (1 → a) × ⋯ × (1 → a)
≅ (1 + ⋯ + 1) → a
The question of associativity arises here as well, for both product and sum. The BNat n a
and Vec n
types both choose right-associativity:
data BNat ∷ * → * where
BZero ∷ BNat (S n)
BSucc ∷ BNat n → BNat (S n)
data Vec ∷ * → * → * where
ZVec ∷ Vec Z a
(:<) ∷ a → Vec n a → Vec (S n) a
Since trie construction is type-driven, I see the vector trie based on right-folded composition as in line with these definitions. Left-folding would come from a small initial change, swapping the constructors in the BNat
definition:
data BNat ∷ * → * where
BSucc ∷ BNat n → BNat (S n)
BZero ∷ BNat (S n)
From this tweaked beginning, a BNat
trie construction would lead to a left-associated Vec
type:
data Vec ∷ * → * → * where
(:<) ∷ Vec n a → a → Vec (S n) a
ZVec ∷ Vec Z a
And then left-folded composition for the Vec
trie.
Conal Elliott » Blog Archive » Deriving parallel tree scans:
[…] to extend the derivation to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The functions initTs and tailTs make unbalanced trees out of balanced […]
1 March 2011, 12:41 pmConal Elliott » Blog Archive » Composable parallel scanning:
[…] of inits and tails on lists) to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The difficulty I encounter is that the functions initTs and tailTs make […]
2 March 2011, 11:24 amConal Elliott » Blog Archive » A third view on trees:
[…] Alternatively, make the tree sizes explicit in the types, as in a few recent posts, including A trie for length-typed vectors. (In those posts, I used the terms "right-folded" and "left-folded" in place of […]
4 June 2011, 1:35 pm