## Doing more with length-typed vectors

The post Fixing lists defined a (commonly used) type of vectors, whose lengths are determined statically, by type. In `Vec n a`, the length is `n`, and the elements have type `a`, where `n` is a type-encoded unary number, built up from zero and successor (`Z` and `S`).

``infixr 5 :<data Vec ∷ * → * → * where  ZVec ∷                Vec Z     a  (:<) ∷ a → Vec n a → Vec (S n) a``

It was fairly easy to define `foldr` for a `Foldable` instance, `fmap` for `Functor`, and `(⊛)` for `Applicative`. Completing the `Applicative` instance is tricky, however. Unlike `foldr`, `fmap`, and `(⊛)`, `pure` doesn't have a vector structure to crawl over. It must create just the right structure anyway. I left this challenge as a question to amuse readers. In this post, I give a few solutions, including my current favorite.

You can find the code for this post and the two previous ones in a code repository.

### An Applicative instance

As a review, here is our `Functor` instance:

``instance Functor (Vec n) where  fmap _ ZVec     = ZVec  fmap f (a :< u) = f a :< fmap f u``

And part of an `Applicative` instance:

``instance Applicative (Vec n) where  pure a = ??  ZVec      ⊛ ZVec      = ZVec  (f :< fs) ⊛ (x :< xs) = f x :< (fs ⊛ xs)``

For `pure`, recall the troublesome goal signature:

``  pure ∷ a → Vec n a``

There's at least one very good reason this type is problematic. The type `n` is completely unrestricted. There is nothing to require `n` to be a natural number type, rather than `Bool`, `String`, `String → Bool`, etc.

In contrast to this difficulty with `pure`, consider what if `n ≡ String` in the type of `fmap`:

``fmap ∷ (a → b) → Vec n a → Vec n b``

The definition of `Vec` guarantees that that there are no values of type `Vec String a`. So it's vacuously easy to cover that case (with an empty function). Similarly for `(⊛)`.

If we were to somehow define `pure` with the type given above, then `pure ()` would have type `Vec String ()` (among many other types). However, there are no values of that type. Hence, `pure` cannot be defined without restricting `n`.

Since the essential difficulty here is the unrestricted nature of `n`, let's look at restricting it. We'll want to include exactly the types that can arise in constructing `Vec` values, namely `Z`, `S Z`, `S (S Z)`, `S (S (S Z))`, etc.

As a first try, define a class with two instances:

``class IsNat ninstance IsNat Zinstance IsNat n ⇒ IsNat (S n)``

Then change the `Applicative` instance to require `IsNat n`:

``instance IsNat n ⇒ Applicative (Vec n) where  ⋯``

The definition of `(⊛)` given above still type-checks. Well, not quite. Really, the recursive call to `(⊛)` fails to type-check, because the `IsNat` constraint cannot be proved. One solution is to add that constraint to the vector type:

``data Vec ∷ * → * → * where  ZVec ∷ Vec Z a  (:<) ∷ IsNat n ⇒ a → Vec n a → Vec (S n) a``

Another is to break the definition `(⊛)` out into a separate recursion that omits the `IsNat` constraint:

``instance IsNat n ⇒ Applicative (Vec n) where  pure = ???  (⊛)  = applyVapplyV ∷ Vec n (a → b) → Vec n a → Vec n bZVec      `applyV` ZVec      = ZVec(f :< fs) `applyV` (x :< xs) = f x :< (fs `applyV` xs)``

Now, how can we define `pure`? We still don't have enough structure. To get that structure, add a method to `IsNat`. That method could simply be the definition of `pure` that we need.

``class IsNat n where pureN ∷ a → Vec n ainstance IsNat Z                where pureN a = ZVecinstance IsNat n ⇒ IsNat (S n) where pureN a = a :< pureN a``

To get this second instance to type-check, we'll have to add the constraint `IsNat n` to the `(:<)` constructor in `Vec`. Then define `pure = pureN` for `Vec`.

I prefer a variation on this solution. Instead of `pureN`, use a method that can only make vectors of `()`:

``class IsNat n where units ∷ Vec n ()instance            IsNat Z     where units = ZVecinstance IsNat n ⇒ IsNat (S n) where units = () :< units``

Then define

``  pure a = fmap (const a) units``

Neat trick, huh? I got it from Applicative Programming with Effects (section 7).

### Value-typed natural numbers

There's still another way to define `IsNat`, and it's the one I actually use.

Define a type of natural number with matching value & type:

``data Nat ∷ * → * where  Zero ∷                    Nat Z  Succ ∷ IsNat n ⇒ Nat n → Nat (S n)``

Interpret a `Nat` as an `Integer`

``natToZ ∷ Nat n → IntegernatToZ Zero     = 0natToZ (Succ n) = (succ ∘ natToZ) n``

I wrote the second clause strangely to emphasize the following lovely property, which corresponds to a simple commutative diagram:

``natToZ ∘ Succ = succ ∘ natToZ``

This `natToZ` function is handy for showing natural numbers:

``instance Show (Nat n) where show = show ∘ natToZ``

A fun & strange thing about `Nat n` is that it can have at most one inhabitant for any type `n`. We can synthesize that inhabitant via an alternative definition of the `IsNat` class defined (twice) above:

``class IsNat n where nat ∷ Nat ninstance            IsNat Z     where nat = Zeroinstance IsNat n ⇒ IsNat (S n) where nat = Succ nat``

Using this latest version of `IsNat`, we can easily define `units` (and hence `pure` on `Vec n` for `IsNat n`):

``units ∷ IsNat n ⇒ Vec n ()units = unitsN natunitsN ∷ Nat n → Vec n ()unitsN Zero     = ZVecunitsN (Succ n) = () :< unitsN n``

I prefer this latest `IsNat` definition over the previous two, because it relies only on `Nat`, which is simpler and more broadly useful than `Vec`. Examples abound, including improving an recent post, as we'll see now.

### Revisiting type-bounded numbers

The post Type-bounded numbers defined a type `BNat n` of natural numbers less than `n`, which can be used, for instance, as numerical digits in base `n`.

``data BNat ∷ * → * where  BZero ∷          BNat (S n)  BSucc ∷ BNat n → BNat (S n)``

One useful operation is conversion from integer to `BNat n`. This operation had the awkward task of coming up with `BNat` structure. The solution given was to introduce a type class, with instances for `Z` and `S`:

``class HasBNat n where toBNat ∷ Integer → Maybe (BNat n)instance HasBNat Z where toBNat _ = Nothinginstance HasBNat n ⇒ HasBNat (S n) where  toBNat m | m < 1     = Just BZero           | otherwise = fmap BSucc (toBNat (pred m))``

We can instead eliminate the `HasBNat` class and reuse the `IsNat` class, as in the last technique above for defining `units` or `pure`.

``toBNat ∷ ∀ n. IsNat n ⇒ Integer → Maybe (BNat n)toBNat = loop n where  n = nat ∷ Nat n  loop ∷ Nat n' → Integer → Maybe (BNat n')  loop Zero      _ = Nothing  loop (Succ _)  0 = Just BZero  loop (Succ n') m = fmap BSucc (loop n' (pred m))``

First the easy parts: standard definitions in terms of `pure` and `join`:

``instance IsNat n ⇒ Monad (Vec n) where  return  = pure  v >>= f = join (fmap f v)``

The `join` function on `Vec n` is just like `join` for functions and for streams. (Rightly so, considering the principle of type class morphisms.) It uses diagonalization, and one way to think of vector `join` is that it extracts the diagonal of a square matrix.

``join ∷ Vec n (Vec n a) → Vec n ajoin ZVec      = ZVecjoin (v :< vs) = headV v :< join (fmap tailV vs)``

The `headV` and `tailV` functions are like `head` and `tail` but understand lengths:

``headV ∷ Vec (S n) a → aheadV (a :< _) = a``
``tailV ∷ Vec (S n) a → Vec n atailV (_ :< as) = as``

Unlike their list counterparts, `headV` and `tailV` are safe, in that the precondition of non-emptiness is verified statically.

1. #### conal:

Maxime Henrion suggested an alternative to my `Applicative` instance above:

``instance Applicative (Vec Z) where  pure _ = ZVec  ZVec ⊛ ZVec = ZVecinstance Applicative (Vec n) ⇒ Applicative (Vec (S n)) where  pure a = a :< pure a  (f :< fs) ⊛ (a :< as) = f a :< (fs ⊛ as)``

I hadn’t considered splitting the one instance into two. I like the simplicity of this solution, though I suspect the constraint `Applicative (Vec n)` would be cumbersome in practice. A small drawback of this version is that it requires the GHC language extensions `FlexibleInstances` and `FlexibleContexts`.

Continuing on to `Monad`:

``instance Monad (Vec Z) where  return  = pure  v >>= f = joinZ (fmap f v)instance Monad (Vec n) ⇒ Monad (Vec (S n)) where  return  = pure  v >>= f = joinS (fmap f v)``
``joinZ ∷ Vec Z (Vec Z a) → Vec Z ajoinZ ZVec = ZVecjoinS ∷ Monad (Vec n) ⇒ Vec (S n) (Vec (S n) a) → Vec (S n) ajoinS (v :< vs) = headV v :< join (fmap tailV vs)``

In this case, I'm using the general `join` function on monads. The `Monad` instances would be more elegant and perhaps more efficient if `join` were a method on `Monad` as has been proposed.

2. #### Mikael Bung:

Some time ago I tried to find a solution that didn’t require an extra restriction on the context. As far as I can tell the IsNat solution is the best (in the sense that you only need to carry around one context) you can do currently but Maximes solution together with http://hackage.haskell.org/trac/ghc/wiki/KindSystem looks like it would allow the compiler to convince itself that the Applicative instance exists for all possible Vec n.

3. #### Darius Jahandarie:

Or, whenever this gets finished…

```data Nat = Z | S Nat -- automatically lifted to the type/kind level!

data Vec :: Nat -> * -> * where
ZVec :: Vec Z a
(:<) :: a -> Vec n a -> Vec (S n) a

class Pure n where pureN :: a -> Vec n a

instance Pure Z where pureN a = ZVec
instance Pure (S n) where pureN a = a :< pureN a

instance Applicative (Vec n) where
pure a = pureN a
ZVec <*> ZVec = ZVec
(f :< fs) <*> (x :< xs) = f x :< (fs <*> xs)
```

Or something like that 4. #### Conal Elliott » Blog Archive » Reverse-engineering length-typed vectors:

[…] About « Doing more with length-typed vectors […]

5. #### Bas van Dijk:

Nice post Conal!

We should realy invent a name for this pattern where we have a “type level datatype” (like Z and S n), an isomorphic value level GADT (like Nat) and a class that provides an overloaded constructor for this GADT (like IsNat).

I used this same pattern in 3 places already:

I’m even thinking about using some CPP hackery to abstract the pattern.

6. #### Conal Elliott » Blog Archive » A trie for length-typed vectors:

[…] Doing more with length-typed vectors […]

7. #### Eyal Lotem:

Everyone might already know this, but `(<\$) = fmap . const`, so the above:

`pure a = fmap (const a) units`

Can be written:

`pure a = a <\$ units`

or:

`pure = (<\$ units)`

8. #### conal:

Thanks, Eyal. I had forgotten about `(<\$)`.