Conal Elliott

Fixing lists

30th January 2011, 10:14 am

In the post Memoizing polymorphic functions via unmemoization, I toyed with the idea of lists as tries. I don’t think [a] is a trie, simply because [a] is a sum type (being either nil or a cons), while tries are built out of the identity, product, and composition functors. In contrast, Stream is a trie, being built solely with the identity and product functors. Moreover, Stream is not just any old trie, it is the trie that corresponds to Peano (unary natural) numbers, i.e., Stream a ≅ N → a, where

data N = Zero | Succ N

data Stream a = Cons a (Stream a)

If we didn't already know the Stream type, we would derive it systematically from N, using standard isomorphisms.

Stream is a trie (over unary numbers), thanks to it having no choice points, i.e., no sums in its construction. However, streams are infinite-only, which is not always what we want. In contrast, lists can be finite, but are not a trie in any sense I understand. In this post, I look at how to fix lists, so they can be finite and yet be a trie, thanks to having no choice points (sums)?

You can find the code for this post and the previous one in a code repository.

Edits:

2011-01-30: Added spoilers warning.
2011-01-30: Pointer to code repository.

Fixing lists

Is there a type of finite lists without choice points (sums)? Yes. There are lots of them. One for each length. Instead of having a single type of lists, have an infinite family of types of n-element lists, one type for each n.

In other words, to fix the problem with lists (trie-unfriendliness), split up the usual list type into subtypes (so to speak), each of which has a fixed length.

I realize I'm changing the question to a simpler one. I hope you'll forgive me and hang in to see where this ride goes.

As a first try, we might use tuples as our fixed-length lists:

type L0 a = ()
type L1 a = (a)
type L2 a = (a,a)
type L3 a = (a,a,a)
⋯

However, we can only write down finitely many such types, and I don't know how we could write any definitions that are polymorphic over length.

What can "polymorphic over length" mean in a setting like Haskell, where polymorphism is over types rather than values. Can we express numbers (for lengths, etc) as types? Yes, as in the previous post, Type-bounded numbers, using a common encoding:

data Z    -- zero
data S n  -- successor

Given these type-level numbers, we can define a data type Vec n a, containing only vectors (fixed lists) of length n and elements of type a. Such vectors can be built up as either the zero-length vector, or by adding an element to an vector of length n to get a vector of length n + 1. I don't know how to define this type as a regular algebraic data type, but it's easy as a generalized algebraic data type (GADT):

infixr 5 :<

data Vec ∷ * → * → * where
  ZVec ∷                Vec Z     a
  (:<) ∷ a → Vec n a → Vec (S n) a

For example,

*Vec> :ty 'z' :< 'o' :< 'm' :< 'g' :< ZVec
'z' :< 'o' :< 'm' :< 'g' :< ZVec ∷ Vec (S (S (S (S Z)))) Char

As desired, Vec is length-typed, covers all (finite) lengths, and allows definition of length-polymorphic functions. For instance, it's easy to map functions over vectors:

instance Functor (Vec n) where
  fmap _ ZVec     = ZVec
  fmap f (a :< u) = f a :< fmap f u

The type of fmap here is (a → b) → Vec n a → Vec n b.

Folding over vectors is also straightforward:

instance Foldable (Vec n) where
  foldr _ b ZVec      = b
  foldr h b (a :< as) = a `h` foldr h b as

Is Vec n an applicative functor as well?

instance Applicative (Vec n) where
  ⋯

We would need

pure ∷ a → Vec n a
(⊛)  ∷ Vec n (a → b) → Vec n a → Vec n b

The (⊛) method can be defined similarly to fmap:

  ZVec      ⊛ ZVec      = ZVec
  (f :< fs) ⊛ (x :< xs) = f x :< (fs ⊛ xs)

Unlike fmap and (⊛), pure doesn't have a vector structure to crawl over. It must create just the right structure anyway. You might enjoy thinking about how to solve this puzzle, which I'll tackle in my next post. (Warning: spoilers in the comments below.)

| Comment (RSS) | Trackback

10 Comments

Maxime Henrion:

I’m curious to know if I found the same solution as you did. Using the FlexibleContexts and FlexibleInstances extensions, I split the Applicative instance into two :
```
instance Applicative (Vec Z) where
  pure _ = Nil
  Nil ⊛ Nil = Nil

instance Applicative (Vec n) ⇒ Applicative (Vec (S n)) where
  pure x = x :< pure x
  (f :< fs) <> (x :< xs) = f x :< (fs <> xs)
```
This seems to work as expected. It is quite nice to see that GHC is now able to infer that both definitions of pure and <*> are total, and doesn't generate any warning using -Wall unlike with the definition of <*> given in your post.
30 January 2011, 12:37 pm
Patai Gergely:

Do you have a solution that doesn’t need FlexibleInstances and FlexibleContexts?
30 January 2011, 1:16 pm
conal:

Patai: yes, I have a solution that needs neither FlexibleInstances nor FlexibleContexts.
30 January 2011, 3:30 pm
conal:

Maxime: Oh! I hadn’t considered splitting the one instance into two. I like the your solution’s simplicity and that it avoids the GHC warnings. I suspect the constraint Applicative (Vec n) would be cumbersome in practice.

I think those warnings are due to a GHC bug.

With the code I gave above, GHC 6.12.3 gives me a warning (with -Wall, which I always use):
```
Warning: Pattern match(es) are non-exhaustive
         In the definition of `<*>':
             Patterns not matched:
                 ZVec (_ :< _)
                 (_ :< _) ZVec
```
Adding the following cases silences the compiler.
```
ZVec `applyV` (_ :< _) = ⊥
(_ :< _) `applyV` ZVec = ⊥
```
In a chat on #haskell, benmachine found that GHC 7.0.1 balks at these extra definitions as ill-typed, but also warns of non-exhaustive patterns when the lines are removed. In that chat, dafis pointed to a relevant ghc bug report.
30 January 2011, 4:05 pm
Maxime Henrion:

conal: Ah, it’s good to know those warnings were actually a bug in GHC, and that it’s already reported. Indeed, the Applicative (Vec n) constraints are cumbersome, and I’m even more curious to see how you’ve managed this
30 January 2011, 4:34 pm
Conal Elliott » Blog Archive » Doing more with length-typed vectors:

[…] About « Fixing lists […]
30 January 2011, 5:16 pm
conal:

Maxime: I just posted my solution in Doing more with length-typed vectors. It’s something of a cheat, since I add another type class to get the job done.
30 January 2011, 5:20 pm
Conal Elliott » Blog Archive » A trie for length-typed vectors:

[…] Fixing lists […]
31 January 2011, 10:21 pm
Conal Elliott » Blog Archive » Reverse-engineering length-typed vectors:

[…] Fixing lists […]
1 February 2011, 9:58 am
Conal Elliott » Blog Archive » From tries to trees:

[…] Fixing lists […]
2 February 2011, 3:11 pm

Fixing lists

Fixing lists

10 Comments

Maxime Henrion:

Patai Gergely:

conal:

conal:

Maxime Henrion:

Conal Elliott » Blog Archive » Doing more with length-typed vectors:

conal:

Conal Elliott » Blog Archive » A trie for length-typed vectors:

Conal Elliott » Blog Archive » Reverse-engineering length-typed vectors:

Conal Elliott » Blog Archive » From tries to trees:

Leave a comment

Recent Posts

Recent Comments

Archives

Tags

Tags

Meta