
Parallel speculative addition via memoization

Conal — Tue, 27 Nov 2012 23:39:42 +0000

I’ve been thinking much more about parallel computation for the last couple of years, especially since starting to work at Tabula a year ago. Until getting into parallelism explicitly, I’d naïvely thought that my pure functional programming style was mostly free of sequential bias. After all, functional programming lacks the implicit accidental dependencies imposed by the imperative model. Now, however, I’m coming to see that designing parallel-friendly algorithms takes attention to minimizing the depth of the remaining, explicit data dependencies.

As an example, consider binary addition, carried out from least to most significant bit (as usual). We can immediately compute the first (least significant) bit of the result, but in order to compute the second bit, we’ll have to know whether or not a carry resulted from the first addition. More generally, the $(n + 1)$th sum & carry require knowing the $n$th carry, so this algorithm does not allow parallel execution. Even if we have one processor per bit position, only one processor will be able to work at a time, due to the linear chain of dependencies.

One general technique for improving parallelism is speculation: doing more work than might be needed so that we don’t have to wait to find out exactly what will be needed. In this post, we’ll see a progression of definitions for bitwise addition. We’ll start with a linear-depth chain of carry dependencies and end with logarithmic depth. Moreover, by making careful use of abstraction, these versions will simply be different type specializations of a single, extremely terse polymorphic definition.

A full adder

Let’s start with an adder for two one-bit numbers. Because of the possibility of overflow, the result will be two bits, which I’ll call “sum” and “carry”. So that we can chain these one-bit adders, we’ll also add a carry input.

addB ∷ (Bool,Bool) → Bool → (Bool,Bool)

In the result, the first Bool will be the sum, and the second will be the carry. I’ve curried the carry input to make it stand out from the (other) addends.

There are a few ways to define addB in terms of logic operations. I like the following definition, as it shares a little work between sum & carry:

addB (a,b) cin = (axb ≠ cin, anb ∨ (cin ∧ axb))
 where
   axb = a ≠ b
   anb = a ∧ b

I’m using (≠) on Bool for exclusive or.
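As a quick sanity check, here is a plain-ASCII rendering of addB with an exhaustive test over the eight input combinations. The helper names countTrue and fullAdderOK are assumptions of this sketch, not part of the original code. The check relies only on the fact that the (sum, carry) pair of a full adder is the two-bit binary count of its True inputs.

```haskell
-- ASCII rendering of the full adder; (/=) on Bool is exclusive or.
addB :: (Bool, Bool) -> Bool -> (Bool, Bool)
addB (a, b) cin = (axb /= cin, anb || (cin && axb))
  where
    axb = a /= b
    anb = a && b

-- Hypothetical helper: how many of the three inputs are True.
countTrue :: Bool -> Bool -> Bool -> Int
countTrue a b c = length (filter id [a, b, c])

-- The sum bit has weight 1 and the carry bit weight 2.
fullAdderOK :: Bool
fullAdderOK = and
  [ fromEnum s + 2 * fromEnum c == countTrue a b cin
  | a   <- [False, True]
  , b   <- [False, True]
  , cin <- [False, True]
  , let (s, c) = addB (a, b) cin
  ]
```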

A ripple carry adder

Now suppose we have not just two bits, but two sequences of bits, interpreted as binary numbers arranged from least to most significant bit. For simplicity, I’d like to assume that these sequences have the same length, so rather than taking a pair of bit lists, let’s take a list of bit pairs:

add ∷ [(Bool,Bool)] → Bool → ([Bool],Bool)

To implement add, traverse the list of bit pairs, threading the carries:

add [] c     = ([]  , c)
add (p:ps) c = (s:ss, c'')
 where
   (s ,c' ) = addB p c
   (ss,c'') = add ps c'
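To see the ripple adder at work on ordinary numbers, the following sketch wraps it with two conversion helpers. The names toBits, fromBits, and addInts are hypothetical and exist only for this illustration; the adder itself is the definition above in ASCII.

```haskell
addB :: (Bool, Bool) -> Bool -> (Bool, Bool)
addB (a, b) cin = (axb /= cin, anb || (cin && axb))
  where
    axb = a /= b
    anb = a && b

add :: [(Bool, Bool)] -> Bool -> ([Bool], Bool)
add []       c = ([], c)
add (p : ps) c = (s : ss, c'')
  where
    (s , c' ) = addB p c
    (ss, c'') = add ps c'

-- Hypothetical helper: the low n bits of x, least significant first.
toBits :: Int -> Int -> [Bool]
toBits n x = [odd (x `div` 2 ^ i) | i <- [0 .. n - 1]]

-- Hypothetical helper: interpret little-endian bits as a number.
fromBits :: [Bool] -> Int
fromBits = foldr (\b acc -> fromEnum b + 2 * acc) 0

-- Add two n-bit numbers, appending the final carry as the top bit.
addInts :: Int -> Int -> Int -> Int
addInts n x y = fromBits (ss ++ [c])
  where
    (ss, c) = add (zip (toBits n x) (toBits n y)) False
```

For instance, addInts 4 11 6 pairs up the bits of 11 and 6, ripples the carry through all four positions, and yields 17.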

State

This add definition contains a familiar pattern. The carry values act as a sort of state that gets updated in a linear (non-branching) way. The State monad captures this pattern of computation:

newtype State s a = State (s → (a,s))

By using State and its Monad instance, we can shorten our add definition. First we’ll need a new full adder definition, tweaked for State:

addB ∷ (Bool,Bool) → State Bool Bool
addB (a,b) = do cin ← get
                put (anb ∨ cin ∧ axb)
                return (axb ≠ cin)
 where
   anb = a ∧ b
   axb = a ≠ b

And then the multi-bit adder:

add ∷ [(Bool,Bool)] → State Bool [Bool]
add []     = return []
add (p:ps) = do s  ← addB p
                ss ← add ps
                return (s:ss)

We don’t really need the Monad interface to define add. The simpler and more general Applicative interface suffices:

add []     = pure []
add (p:ps) = liftA2 (:) (addB p) (add ps)

This pattern also looks familiar. Oh — the Traversable instance for lists makes for a very compact definition:

add = traverse addB

Wow. The definition is now so simple that it doesn’t depend on the specific choice of lists. To find out the most general type add can have (with this definition), remove the type signature, turn off the monomorphism restriction, and see what GHCi has to say:

add ∷ Traversable t ⇒ t (Bool,Bool) → State Bool (t Bool)

This constraint is very lenient. Traversable can be derived automatically for all algebraic data types, including nested/non-regular ones.

For instance,

data Tree a = Leaf a | Branch (Tree a) (Tree a)
  deriving (Functor,Foldable,Traversable)

We can now specialize this general add back to lists:

addLS ∷ [(Bool,Bool)] → State Bool [Bool]
addLS = add

We can also specialize for trees:

addTS ∷ Tree (Bool,Bool) → State Bool (Tree Bool)
addTS = add

Or for depth-typed perfect trees (e.g., as described in From tries to trees):

addTnS ∷ IsNat n ⇒
         T n (Bool,Bool) → State Bool (T n Bool)
addTnS = add
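The following self-contained sketch makes the list and tree specializations concrete. It assumes a hand-rolled State with only Functor and Applicative instances (enough for traverse), and it writes addB directly against the State constructor rather than with get and put. The example values are made up for illustration.

```haskell
{-# LANGUAGE DeriveTraversable #-}

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State h) = State $ \s -> let (a, s') = h s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State hf <*> State hx = State $ \s ->
    let (f, s' ) = hf s
        (x, s'') = hx s'
    in  (f x, s'')

-- The carry is the state: read the carry in, write the carry out.
addB :: (Bool, Bool) -> State Bool Bool
addB (a, b) = State $ \cin -> (axb /= cin, anb || (cin && axb))
  where
    axb = a /= b
    anb = a && b

add :: Traversable t => t (Bool, Bool) -> State Bool (t Bool)
add = traverse addB

data Tree a = Leaf a | Branch (Tree a) (Tree a)
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- 11 + 6, least significant bit first, as a list and as a tree.
example :: [(Bool, Bool)]
example = zip [True, True, False, True] [False, True, True, False]

exampleTree :: Tree (Bool, Bool)
exampleTree =
  Branch (Branch (Leaf (True, False)) (Leaf (True, True)))
         (Branch (Leaf (False, True)) (Leaf (True, False)))
```

Both runState (add example) False and runState (add exampleTree) False thread the carry left to right, so the leftmost leaf plays the role of the least significant bit.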

Binary trees are often better than lists for parallelism, because they allow quick recursive splitting and joining. In the case of ripple adders, we don’t really get parallelism, however, because of the single-threaded (linear) nature of State. Can we get around this unfortunate linearization?

Speculation

The linearity of carry propagation interferes with parallel execution even when using a tree representation. The problem is that each addB (full adder) invocation must access the carry out from the previous (immediately less significant) bit position and so must wait for that carry to be computed. Since each bit addition must wait for the previous one to finish, we get linear running time, even with unlimited parallel processing available. If we didn’t have to wait for carries, we could instead get logarithmic running time using the tree representation, since subtrees could be added in parallel.

A way out of this dilemma is to speculatively compute the bit sums for both possibilities, i.e., for carry and no carry. We’ll do more work, but much less waiting.

State memoization

Recall the State definition:

newtype State s a = State (s → (a,s))

Rather than using a function of s, let’s use a table indexed by s. Since s is Bool in our use, a table is simply a uniform pair, so we could replace State Bool a with the following:

newtype BoolStateTable a = BST ((a,Bool), (a,Bool))

Exercise: define Functor, Applicative, and Monad instances for BoolStateTable.

Rather than defining such a specialized type, let’s stand back and consider what’s going on. We’re replacing a function by an isomorphic data type. This replacement is exactly what memoization is about. So let’s define a general memoizing state monad:

newtype StateTrie s a = StateTrie (s ⇰ (a,s))

Note that the definition of memoizing state is nearly identical to State. I’ve simply replaced “→” by “⇰”, i.e., memo tries. For the (simple) source code of StateTrie, see the GitHub project. (Poking around on Hackage, I just found monad-memo, which looks related.)

The full-adder function addB is restricted to State, but unnecessarily so. The most general type is inferred as

addB ∷ MonadState Bool m ⇒ (Bool,Bool) → m Bool

where the MonadState class comes from the mtl package.

With the type-generalized addB, we get a more general type for add as well:

add ∷ (Traversable t, Applicative m, MonadState Bool m) ⇒
      t (Bool,Bool) → m (t Bool)
add = traverse addB

Now we can specialize add to work with memoized state:

addLM ∷ [(Bool,Bool)] → StateTrie Bool [Bool]
addLM = add

addTM ∷ Tree (Bool,Bool) → StateTrie Bool (Tree Bool)
addTM = add
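To make the speculation visible without the trie machinery, here is a sketch using the Bool-specialized table from earlier instead of the general StateTrie. It is one possible answer to the exercise for Functor and Applicative only; the helpers lookupBST and tabulate are names I am assuming for this illustration and are not taken from the project source.

```haskell
-- Table indexed by the incoming carry: (entry for False, entry for True).
newtype BoolStateTable a = BST ((a, Bool), (a, Bool))

-- Hypothetical helper: table look-up, the analogue of applying the function.
lookupBST :: BoolStateTable a -> Bool -> (a, Bool)
lookupBST (BST (f, t)) s = if s then t else f

-- Hypothetical helper: memoize a state transition by tabulating both cases.
tabulate :: (Bool -> (a, Bool)) -> BoolStateTable a
tabulate h = BST (h False, h True)

instance Functor BoolStateTable where
  fmap g (BST ((a, s), (b, t))) = BST ((g a, s), (g b, t))

instance Applicative BoolStateTable where
  pure a = tabulate (\s -> (a, s))
  tf <*> tx = tabulate $ \s ->
    let (f, s' ) = lookupBST tf s
        (x, s'') = lookupBST tx s'
    in  (f x, s'')

-- Each full adder holds its answers for both possible carries.
addB :: (Bool, Bool) -> BoolStateTable Bool
addB (a, b) = tabulate (\cin -> (axb /= cin, anb || (cin && axb)))
  where
    axb = a /= b
    anb = a && b

add :: Traversable t => t (Bool, Bool) -> BoolStateTable (t Bool)
add = traverse addB

-- 11 and 6, least significant bit first.
example :: [(Bool, Bool)]
example = zip [True, True, False, True] [False, True, True, False]
```

The table add example holds the result for both carry inputs: looking up False gives the bits of 11 + 6, and looking up True gives the bits of 11 + 6 + 1.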

What have we done?

The essential tricks in this post are to (a) boost parallelism by speculative evaluation (an old idea) and (b) express speculation as memoization (new, to me at least). The technique wins for binary addition thanks to the small number of possible states, which then makes memoization (full speculation) affordable.

I’m not suggesting that the code above has impressive parallel execution when compiled under GHC. Perhaps it could with some par and pseq annotations. I haven’t tried. This exploration helps me understand a little of the space of hardware-oriented algorithms.

The conditional sum adder looks quite similar to the development above. It has the twist, however, of speculating carries on blocks of a few bits rather than single bits. It’s astonishingly easy to adapt the development above for such a hybrid scheme, forming traversable structures of sequences of bits:

addH ∷ Tree [(Bool,Bool)] → StateTrie Bool (Tree [Bool])
addH = traverse (fromState ∘ add)

I’m using the adapter fromState so that the inner list additions will use State while the outer tree additions will use StateTrie, thanks to type inference. This adapter memoizes and rewraps the state transition function:

fromState ∷ HasTrie s ⇒ State s a → StateTrie s a
fromState = StateTrie ∘ trie ∘ runState

From tries to trees

Conal — Tue, 01 Feb 2011 18:36:32 +0000

This post is the last of a series of six relating numbers, vectors, and trees, revolving around the themes of static size-typing and memo tries. We’ve seen that length-typed vectors form a trie for bounded numbers, and can handily represent numbers as well. We’ve also seen that n-dimensional vectors themselves have an elegant trie, which is the n-ary composition of the element type’s trie functor:

type VTrie n a = Trie a :^ n

where for any functor f and natural number type n,

f :^ n ≅ f ∘ ⋯ ∘ f  -- (n times)

This final post in the series places this elegant mechanism of n-ary functor composition into a familiar & useful context, namely trees. Again, type-encoded Peano numbers are central. Just as BNat uses these number types to (statically) bound natural numbers (e.g., for a vector index or a numerical digit), and Vec uses number types to capture vector length, we'll next use number types to capture tree depth.

Edits:

2011-02-02: Changes thanks to comments from Sebastian Fischer
- Added note about number representations and leading zeros (without size-typing).
- Added pointer to Memoizing polymorphic functions via unmemoization for derivation of Tree d a ≅ [d] → a.
- Fixed signatures for some Branch variants, bringing type parameter a into parens.
- Clarification about number of VecTree vs pairing constructors in remarks on left- vs right-folded trees.
2011-02-06: Fixed link to From Fast Exponentiation to Square Matrices.

Infinite trees

In the post Memoizing polymorphic functions via unmemoization, I played with a number of container types, looking at which ones are tries over what domain types. I referred to these domain types as "index types" for the container type. One such container was a type of infinite binary trees with values at every node:

data BinTree a = BinTree a (BinTree a) (BinTree a)

By the usual exponent laws, this BinTree functor is (isomorphic to) the functor of tries over a type of binary natural numbers formulated as follows:

data BinNat = Zero | Even BinNat | Odd BinNat

As a variation on this BinTree, we can replace the two subtrees with a pair of subtrees:

data BinTree a = BinTree a (Pair (BinTree a))

Where Pair could be defined as

data Pair a = Pair a a

or, using functor combinators,

type Pair = Id × Id

The reformulation of BinTree leads to a slightly different representation for our index type, a little-endian list of bits:

data BinNat = Zero | NonZero Bool BinNat

or simply

type BinNat = [Bool]

Note that Bool is the index type for Pair (and conversely, Pair is the trie for Bool), which suggests that we play this same trick for all index types and their corresponding trie functors. Generalizing,

data Tree d a = Tree a (d ↛ Tree d a)

where k ↛ v is short for Trie k v, and Trie k is the trie functor associated with the type k. See Elegant memoization with higher-order types.

These generalized trees are indexed by little-endian natural numbers over a "digit" type d:

Tree d a ≅ [d] → a

which is to say that Tree d is a trie for [d]. See Memoizing polymorphic functions via unmemoization for a derivation.
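For the special case d = Bool, the isomorphism can be sketched directly on the Pair-based BinTree. The function names trieT, untrieT, and fromBits are assumptions of this sketch; the general construction would go through the Trie functor instead.

```haskell
data Pair a = Pair a a

-- Infinite binary tree with a value at every node.
data BinTree a = BinTree a (Pair (BinTree a))

-- Function to tree: lazily tabulate the function over all bit lists.
trieT :: ([Bool] -> a) -> BinTree a
trieT h = BinTree (h []) (Pair (trieT (h . (False :))) (trieT (h . (True :))))

-- Tree to function: follow the bits down from the root.
untrieT :: BinTree a -> [Bool] -> a
untrieT (BinTree a _)          []       = a
untrieT (BinTree _ (Pair f t)) (b : bs) = untrieT (if b then t else f) bs

-- Hypothetical example function: little-endian bits to a number.
fromBits :: [Bool] -> Integer
fromBits = foldr (\b acc -> toInteger (fromEnum b) + 2 * acc) 0

memoFromBits :: [Bool] -> Integer
memoFromBits = untrieT (trieT fromBits)
```

Because the tree is built lazily, only the nodes on the paths that are actually looked up get evaluated.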

Note that all of these number representations have a serious problem, which is that they distinguish between number representations that differ only by leading zeros. The size-typed versions do not have this problem.

Finite trees

The reason I chose infinite trees was that the finite tree types I knew of have choice-points/alternatives, and so are isomorphic to sums. I don't know of trie-construction techniques that synthesize sums.

Can we design a tree type that is both finite and choice-free? We've already tackled a similar challenge with lists in previous posts.

In Fixing lists, I wanted to "fix" lists, in the sense of eliminating the choice points in the standard type [a] so that the result could be a trie. Doing so led to the type Vec n a, which appears to have choice points, due to the two constructors ZVec and (:<), but for any given n, at most one constructor is applicable. (For this reason, regular algebraic data types are inadequate.) For handy review,

data Vec ∷ * → * → * where
  ZVec ∷                Vec Z     a
  (:<) ∷ a → Vec n a → Vec (S n) a

Let's try the same trick with trees, fixing depth instead of length, to get depth-typed binary trees:

data BinTree ∷ * → * → * where
  Leaf   ∷ a                         → BinTree Z     a
  Branch ∷ BinTree n a → BinTree n a → BinTree (S n) a
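A compilable ASCII version of this depth-typed tree is sketched below, assuming empty data types Z and S for the type-level numbers and Type from Data.Kind in place of *. The names toList and perfect2 are only for illustration.

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Type-level Peano numbers (no values needed).
data Z
data S n

data BinTree :: Type -> Type -> Type where
  Leaf   :: a                         -> BinTree Z     a
  Branch :: BinTree n a -> BinTree n a -> BinTree (S n) a

-- Leaves in left-to-right order.
toList :: BinTree n a -> [a]
toList (Leaf a)     = [a]
toList (Branch l r) = toList l ++ toList r

-- A perfect tree of depth two.
perfect2 :: BinTree (S (S Z)) Int
perfect2 = Branch (Branch (Leaf 0) (Leaf 1)) (Branch (Leaf 2) (Leaf 3))

-- A ragged tree is rejected by the type checker, since the two
-- subtrees of a Branch must share the same depth index:
--   Branch (Leaf 0) (Branch (Leaf 1) (Leaf 2))
```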

Again, we can replace the two subtrees with a single pair of subtrees in the Branch constructor:

  Branch ∷ Pair (BinTree n) a → BinTree (S n) a

Or, recalling that Bool is the index type for Pair:

data BinTree ∷ * → * → * where
  Leaf   ∷                    a → BinTree Z     a
  Branch ∷ (Bool ↛ BinTree n a) → BinTree (S n) a

The use of Bool is rather ad hoc. Its useful property is isomorphism with 1 + 1, whose corresponding trie functor is Id × Id, i.e., Pair. In the post Type-bounded numbers, we saw another, more systematic, type isomorphic to 1 + 1, which is BNat TwoT (i.e., BNat (S (S Z))), which is the type of natural numbers less than two.

  Branch ∷ (BNat TwoT ↛ BinTree n a) → BinTree (S n) a

This replacement suggests a generalization from binary trees to b-ary trees (i.e., having branch factor b).

data VecTree ∷ * → * → * → * where
  Leaf   ∷                        a → VecTree b Z     a
  Branch ∷ (BNat b ↛ VecTree b n a) → VecTree b (S n) a

Recall that the motivation for BNat b was as an index type for Vec b, which is to say that Vec b turned out to be the trie functor for the type BNat b. With this relationship in mind, the Branch type is equivalent to

  Branch ∷ Vec b (VecTree b n a) → VecTree b (S n) a

Unsurprisingly then, a b-ary tree is either a single leaf value or a branch node containing b subtrees. Also, the depth of a leaf is zero, and the depth of a branch node containing b subtrees each of depth n is n + 1.

We can also generalize this VecTree type by replacing Vec b with an arbitrary functor f:

data FTree ∷ (* → *) → * → * → * where
  Leaf   ∷               a → FTree f Z     a
  Branch ∷ f (FTree f n) a → FTree f (S n) a

type VecTree b = FTree (Vec b)

Better yet, introduce an intermediate generalization, using the property that Vec b ≡ Trie (BNat b):

type TrieTree i = FTree (Trie i)

type VecTree b = TrieTree (BNat b)

With the exception of the most general form (FTree), these trees are also tries.

Generalizing and inverting our trees

The FTree type looks very like another data type that came up above, namely right-folded n-ary functor composition:

data (:^) ∷ (* → *) → * → (* → *) where
  ZeroC ∷             a  → (f :^ Z    ) a
  SuccC ∷ f ((f :^ n) a) → (f :^ (S n)) a

These two types are not just similar; they're identical (different only in naming, i.e., α-equivalent), so we can use f :^ n in place of FTree f n:

type TrieTree i = (:^) (Trie i)

Instead of right-folded functor composition, we could go with left-folded. What difference would it make to our notions of b-ary or binary trees?

First look at (right-folded) BinTree:

data BinTree ∷ * → * → * where
  Leaf   ∷                  a → BinTree Z     a
  Branch ∷ Pair (BinTree n) a → BinTree (S n) a

Equivalently,

data BinTree ∷ * → * → * where
  Leaf   ∷                         a → BinTree Z     a
  Branch ∷ (BNat TwoT ↛ BinTree n a) → BinTree (S n) a

With left-folding, the Branch constructors would be

  Branch ∷ BinTree n (Pair a) → BinTree (S n) a

  Branch ∷ BinTree n (BNat TwoT ↛ a) → BinTree (S n) a

Then (right-folded) VecTree:

data VecTree ∷ * → * → * → * where
  Leaf   ∷                        a → VecTree b Z     a
  Branch ∷ (BNat b ↛ VecTree b n a) → VecTree b (S n) a

Equivalently,

data VecTree ∷ * → * → * → * where
  Leaf   ∷                    a  → VecTree b Z     a
  Branch ∷ Vec b (VecTree b n a) → VecTree b (S n) a

With left-folding:

  Branch ∷ VecTree b n (BNat b ↛ a) → VecTree b (S n) a

  Branch ∷ VecTree b n (Vec b a) → VecTree b (S n) a

In shifting from right- to left-folding, our tree structuring becomes inverted. Now a "b-ary" tree really has only one subtree per branch node, not b subtrees.

For instance, a right-folded binary tree of depth two might look like

Branch (Branch (Leaf 0, Leaf 1), Branch (Leaf 2, Leaf 3))

For readability, I'm using normal pairs instead of 2-vectors or Pair pairs here. In contrast, the corresponding left-folded binary tree would look like

Branch (Branch (Leaf ((0,1),(2,3))))

Note that the VecTree constructors are in a linear chain, forming an outer shell, and the number of such constructors is one more than the depth, and hence logarithmic in the number of leaves. The right-folded form has VecTree constructors scattered throughout the tree, and the number of such constructors is exponential in the depth, and hence linear in the number of leaves. (As Sebastian Fischer pointed out, however, the number of pairing constructors is not reduced in the left-folded form.)

For more examples of this sort of inversion, see Chris Okasaki's gem of a paper From Fast Exponentiation to Square Matrices.

What sort of trees do we have?

I pulled a bit of a bait-and-switch above in reformulating trees. The initial infinite tree type had values and branching at every node:

data BinTree a = BinTree a (BinTree a) (BinTree a)

In contrast, the depth-typed trees (whether binary, b-ary, trie-ary, or functor-ary) all have strict separation of leaf nodes from branching nodes.

A conventional, finite binary tree data type might look like

data BinTree a = Leaf a | Branch (Pair (BinTree a))

Its inverted (left-folded) form:

data BinTree a = Leaf a | Branch (BinTree (Pair a))

When we had depth typing, the right- and left-folded forms were equally expressive. They both described "perfect" trees, with each depth consisting entirely of branching or (for the deepest level) entirely of values. (I'm speaking figuratively for the left-folded case, since literally there is no branching.)

Without depth typing, the expressiveness differs significantly. Right-folded trees can be ragged, with leaves occurring at various depths in the same tree. Left-folded binary trees of depth n can only be perfect, even though the depth is determined dynamically, not statically (i.e., not from type).
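The contrast can be sketched in runnable form. To let both variants live in one module, I have renamed them RTree and LTree, and the helpers unpair, ltoList, and depthL are likewise hypothetical names. Note that the functions on LTree use polymorphic recursion, so their type signatures are required.

```haskell
data Pair a = Pair a a

-- Right-folded: subtrees inside a pair; leaves may sit at any depth.
data RTree a = RLeaf a | RBranch (Pair (RTree a))

-- Left-folded (nested): one subtree whose elements are pairs.
data LTree a = LLeaf a | LBranch (LTree (Pair a))

unpair :: Pair a -> [a]
unpair (Pair x y) = [x, y]

-- Flatten a left-folded tree (polymorphic recursion at type Pair a).
ltoList :: LTree a -> [a]
ltoList (LLeaf a)   = [a]
ltoList (LBranch t) = concatMap unpair (ltoList t)

-- Depth is the length of the outer chain of LBranch constructors.
depthL :: LTree a -> Int
depthL (LLeaf _)   = 0
depthL (LBranch t) = 1 + depthL t

-- A ragged right-folded tree: leaves at depths one and two.
ragged :: RTree Int
ragged = RBranch (Pair (RLeaf 0) (RBranch (Pair (RLeaf 1) (RLeaf 2))))

-- A left-folded tree of depth two; raggedness cannot be expressed.
perfectL :: LTree Int
perfectL = LBranch (LBranch (LLeaf (Pair (Pair 0 1) (Pair 2 3))))
```

In this sketch, a left-folded tree of dynamic depth d always flattens to exactly 2^d values, which is one way to see that it can only be perfect.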

Dynamically-depthed binary trees generalize to b-ary, trie-ary, and functor-ary versions. In each case, the left-folded versions are much more statically constrained than their right-folded counterparts.

From here

This post is the last of a six-part series on tries and static size-typing in the context of numbers, vectors, and trees. Maybe you're curious where these ideas came from and where they might be going.

I got interested in these relationships while noodling over some imperative, data-parallel programs. I asked one of my standard questions: What elegant beauty is hiding deep beneath these low-level implementation details? In this case, prominent details include array indices, bit fiddling, and power-of-two restrictions, which led me to play with binary numbers. Moreover, parallel algorithms often use a divide-and-conquer strategy. That strategy hints at balanced binary trees, which then can be indexed by binary numbers (bit sequences). Indexing brought memo tries to mind.

I expect to write soon about some ideas & techniques for deriving low-level, side-effecting, parallel algorithms from semantically simple and elegant specifications.

A trie for length-typed vectors

Conal — Mon, 31 Jan 2011 23:03:48 +0000

As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up.

Memoization is the conversion of functions to data structures. A simple, elegant, and purely functional form of memoization comes from applying three common type isomorphisms, which also correspond to three laws of exponents, familiar from high school math, as noted by Ralf Hinze in his paper Generalizing Generalized Tries.

In Haskell, one can neatly formulate memo tries via an associated functor, Trie, with a convenient synonym "k ↛ v" for Trie k v, as in Elegant memoization with higher-order types. (Note that I’ve changed my pretty-printing from "k :→: v" to "k ↛ v".) The key property is that the data structure encodes (is isomorphic to) a function, i.e.,

k ↛ a ≅ k → a

In most cases, we ignore non-strictness, though there is a delightful solution for memoizing non-strict functions correctly.

My previous four posts explored use of types to statically bound numbers and to determine lengths of vectors.

Just as (infinite-only) streams are the natural trie for unary natural numbers, we saw in Reverse-engineering length-typed vectors that length-typed vectors (one-dimensional arrays) are the natural trie for statically bounded natural numbers.

BNat n ↛ a ≡ Vec n a

and so

BNat n → a ≅ Vec n a

In retrospect, this relationship is completely unsurprising, since a vector of length n is a collection of values, indexed by 0, …, n − 1.

In that same post, I noted that vectors are not only a trie for bounded numbers, but when the elements are also bounded numbers, the vectors can also be thought of as numbers. Both the number of digits and the number base are captured statically, in types:

type Digits n b = Vec n (BNat b)

The type parameters n and b here are type-encodings of unary numbers, i.e., built up from zero and successor (Z and S). For instance, when b ≡ S (S Z), we have n-bit binary numbers.

In this new post, I look at another question of tries and vectors. Given that Vec n is the trie for BNat n, is there also a trie for Vec n?

Edits:

2011-01-31: Switched trie notation to "k ↛ v" to avoid missing character on iPad.

A trie for length-typed vectors

A vector is a trie over bounded numbers. What is a trie over vectors? As always, isomorphisms show us the way.

Vec n a → b ≅ (a × ⋯ × a) → b
             ≅ a → ⋯ → a → b
             ≅ a ↛ ⋯ ↛ a ↛ b
             ≡ Trie a (⋯ (Trie a b)⋯)
             ≅ (Trie a ∘ ⋯ ∘ Trie a) b

So the trie (functor) for Vec n a is the n-ary composition of tries for a:

type VTrie n a = Trie a :^ n

where for any functor f and natural number type n,

f :^ n ≅ f ∘ ⋯ ∘ f  -- (n times)

N-ary functor composition

Since composition is associative, a recursive formulation might naturally fold from the left or from the right. (Or perhaps in a balanced tree, to facilitate parallel execution.)

Right-folded composition

Let's look at each fold direction, starting with the right, i.e.,

f :^ Z   ≅ Id
f :^ S n ≅ f ∘ (f :^ n)

Writing as a GADT:

data (:^) ∷ (* → *) → * → (* → *) where
  ZeroC ∷                        a  → (f :^ Z) a
  SuccC ∷ IsNat n ⇒ f ((f :^ n) a) → (f :^ (S n)) a

Functors compose into functors and applicatives into applicatives. (See Applicative Programming with Effects (section 5) and the instance definitions in Semantic editor combinators.) The following definitions arise from the standard instances for binary functor composition.

instance Functor f ⇒ Functor (f :^ n) where
  fmap h (ZeroC a)  = ZeroC (h a)
  fmap h (SuccC fs) = SuccC ((fmap∘fmap) h fs)

instance (IsNat n, Applicative f) ⇒ Applicative (f :^ n) where
  pure = pureN nat
  ZeroC f  ⊛ ZeroC x  = ZeroC (f x)
  SuccC fs ⊛ SuccC xs = SuccC (liftA2 (⊛) fs xs)

pureN ∷ Applicative f ⇒ Nat n → a → (f :^ n) a
pureN Zero     a = ZeroC a
pureN (Succ _) a = SuccC ((pure ∘ pure) a)

More explicitly, the second pure could instead use pureN:

pureN (Succ n) a = SuccC ((pure ∘ pureN n) a)
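As a smaller runnable sketch, here is the right-folded composition type with only its Functor instance, which needs no IsNat constraint. I am assuming empty types Z and S, Type from Data.Kind in place of *, and the list functor as the example f. The names flatten and nested are illustrative.

```haskell
{-# LANGUAGE GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)

data Z
data S n

-- Right-folded n-ary composition of a functor f with itself.
data (:^) :: (Type -> Type) -> Type -> (Type -> Type) where
  ZeroC :: a              -> (f :^ Z)   a
  SuccC :: f ((f :^ n) a) -> (f :^ S n) a

instance Functor f => Functor (f :^ n) where
  fmap h (ZeroC a)  = ZeroC (h a)
  fmap h (SuccC fs) = SuccC ((fmap . fmap) h fs)

-- Collect all elements, outermost layer first.
flatten :: Foldable f => (f :^ n) a -> [a]
flatten (ZeroC a)  = [a]
flatten (SuccC fs) = concatMap flatten fs

-- Lists composed twice: a list of lists of Ints.
nested :: ([] :^ S (S Z)) Int
nested = SuccC [SuccC [ZeroC 1, ZeroC 2], SuccC [ZeroC 3]]
```

Mapping (+ 1) over nested reaches through both list layers, so flatten (fmap (+ 1) nested) lists the incremented elements in order.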

Some tidier definitions

Using (⊔), there are tidier definitions of fmap and (⊛):

  fmap h = inZeroC h ⊔ inSuccC ((fmap∘fmap) h)

  (⊛)  = inZeroC2 ($) ⊔ inSuccC2 (liftA2 (⊛))

where the new combinators are partial functions that work inside of ZeroC and SuccC.

inZeroC  h (ZeroC a ) = ZeroC (h a )

inSuccC  h (SuccC as) = SuccC (h as)

inZeroC2 h (ZeroC a ) (ZeroC b ) = ZeroC (h a  b )

inSuccC2 h (SuccC as) (SuccC bs) = SuccC (h as bs)

This example demonstrates another notational benefit of (⊔), extending the techniques in the post Lazier function definitions by merging partial values.

Left-folded composition

For left-folded composition, a tiny change suffices in the S case:

f :^ Z   ≅ Id
f :^ S n ≅ (f :^ n) ∘ f

which translates to a correspondingly tiny change in the SuccC constructor.

data (:^) ∷ (* → *) → * → (* → *) where
  ZeroC ∷                        a  → (f :^ Z) a
  SuccC ∷ IsNat n ⇒ (f :^ n) (f a) → (f :^ (S n)) a

The Functor and Applicative instances are completely unchanged.

Vector tries (continued)

Using the analysis above, we can easily define tries over vectors as n-ary composition of tries over the vector element type. Again, there is a right-folded and a left-folded version.

Right-folded vector tries

instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
  type Trie (Vec n a) = Trie a :^ n

Conversion from trie to function is, as always, a trie look-up. Its definition closely follows the definition of f :^ n:

  ZeroC v `untrie` ZVec      = v
  SuccC t `untrie` (a :< as) = (t `untrie` a) `untrie` as

For untrie, we were able to follow the zero/successor structure of the trie. For trie, we don't have such a structure to follow, but we can play the same trick as for defining units above. Use the nat method of the IsNat class to synthesize a number of type Nat n, and then follow the structure of that number in a new recursive function definition.

  trie = trieN nat

where

trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero     f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ a → trie (f ∘ (a :<))))
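The pieces above can be assembled into a self-contained sketch. Several simplifications here are my assumptions rather than the original code: the HasTrie class is a cut-down version with a type-synonym family, Bool is the only base instance, Succ and SuccC carry no IsNat constraint, and look-up goes through a helper untrieN so that the recursion does not need a class constraint. The recursive call also uses trieN n where the text uses trie.

```haskell
{-# LANGUAGE GADTs, KindSignatures, TypeFamilies, TypeOperators #-}
-- UndecidableInstances is included defensively for the nested Trie application.
{-# LANGUAGE UndecidableInstances #-}

import Data.Kind (Type)

data Z
data S n

data Nat :: Type -> Type where
  Zero :: Nat Z
  Succ :: Nat n -> Nat (S n)

class IsNat n where nat :: Nat n
instance IsNat Z where nat = Zero
instance IsNat n => IsNat (S n) where nat = Succ nat

infixr 5 :<
data Vec :: Type -> Type -> Type where
  ZVec ::                 Vec Z     a
  (:<) :: a -> Vec n a -> Vec (S n) a

data (:^) :: (Type -> Type) -> Type -> (Type -> Type) where
  ZeroC :: a              -> (f :^ Z)   a
  SuccC :: f ((f :^ n) a) -> (f :^ S n) a

class HasTrie k where
  type Trie k :: Type -> Type
  trie   :: (k -> v) -> Trie k v
  untrie :: Trie k v -> (k -> v)

data Pair a = Pair a a

instance HasTrie Bool where
  type Trie Bool = Pair
  trie h = Pair (h False) (h True)
  untrie (Pair f t) b = if b then t else f

instance (IsNat n, HasTrie a) => HasTrie (Vec n a) where
  type Trie (Vec n a) = Trie a :^ n
  trie   = trieN nat
  untrie = untrieN

-- Follow the structure of the number to build the trie.
trieN :: HasTrie a => Nat n -> (Vec n a -> b) -> (Trie a :^ n) b
trieN Zero     f = ZeroC (f ZVec)
trieN (Succ n) f = SuccC (trie (\a -> trieN n (f . (a :<))))

-- Follow the structure of the trie to look up a vector.
untrieN :: HasTrie a => (Trie a :^ n) b -> Vec n a -> b
untrieN (ZeroC v) ZVec      = v
untrieN (SuccC t) (a :< as) = untrieN (untrie t a) as

-- Example: a memoized function over two-element Boolean vectors.
xor2 :: Vec (S (S Z)) Bool -> Bool
xor2 (a :< b :< ZVec) = a /= b

memoXor2 :: Vec (S (S Z)) Bool -> Bool
memoXor2 = untrie (trie xor2)
```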

Left-folded vector tries

The change from right-folding to left-folding is minuscule.

instance (IsNat n, HasTrie a) ⇒ HasTrie (Vec n a) where
  type Trie (Vec n a) = Trie a :^ n

  ZeroC b `untrie` ZVec      = b
  SuccC t `untrie` (a :< as) = (t `untrie` as) `untrie` a
  
  trie = trieN nat

trieN ∷ HasTrie a ⇒ Nat n → (Vec n a → b) → (Trie a :^ n) b
trieN Zero     f = ZeroC (f ZVec)
trieN (Succ _) f = SuccC (trie (λ as → trie (f ∘ (:< as))))

Right vs Left?

There are two tries for Vec n a, but with Haskell we have to choose one as the HasTrie instance. How can we make our choice?

The same sort of question arose earlier. The post Reverse-engineering length-typed vectors showed how to discover Vec n by looking for the trie functor for BNat n. The derivation was

Vec n a ≅ a × ⋯ × a
        ≅ (1 → a) × ⋯ × (1 → a)
        ≅ (1 + ⋯ + 1) → a

The question of associativity arises here as well, for both product and sum. The BNat n and Vec n a types both choose right-associativity:

data BNat ∷ * → * where
  BZero ∷          BNat (S n)
  BSucc ∷ BNat n → BNat (S n)

data Vec ∷ * → * → * where
  ZVec ∷                Vec Z     a
  (:<) ∷ a → Vec n a → Vec (S n) a

Since trie construction is type-driven, I see the vector trie based on right-folded composition as in line with these definitions. Left-folding would come from a small initial change, swapping the constructors in the BNat definition:

data BNat ∷ * → * where
  BSucc ∷ BNat n → BNat (S n)
  BZero ∷          BNat (S n)

From this tweaked beginning, a BNat trie construction would lead to a left-associated Vec type:

data Vec ∷ * → * → * where
  (:<) ∷ Vec n a → a → Vec (S n) a
  ZVec ∷                Vec Z     a

And then left-folded composition for the Vec trie.

Reverse-engineering length-typed vectors

Conal — Mon, 31 Jan 2011 17:52:56 +0000

The last few posts followed a winding path toward a formulation of a type for length-typed vectors. In Fixing lists, I mused how something like lists could be a trie type. The Stream functor (necessarily infinite lists) is the natural trie for Peano numbers. The standard list functor [] (possibly finite lists) doesn’t seem to be a trie, since it’s built from sums. However, the functor Vec n of vectors ("fixed lists") of length n is built from (isomorphic to) products only (for any given n), and so might well be a trie.

Of what type is Vec n the corresponding trie? In other words, for what type q is Vec n a isomorphic to q → a (for all a)?

Turning this question on its head, what simpler type gives rise to length-typed vectors in a standard fashion?

Edits:

2011-02-01: Define Digits n b as BNat n ↛ BNat b.

Deriving vectors

Recalling that Vec n a is an n-ary product of the type a, and forming the sort of derivation I used in Memoizing polymorphic functions via unmemoization,

Vec n a ≅ a × ⋯ × a
        ≅ (1 → a) × ⋯ × (1 → a)
        ≅ (1 + ⋯ + 1) → a

where the type 1 is what we call "unit" or "()" in Haskell, having exactly one (non-bottom) value. I used the isomorphism (a + b) → c ≅ (a → c) × (b → c), which corresponds to a familiar law of exponents: $c^{a + b} = c^a \times c^b$.

So the sought domain type q is any type isomorphic to 1 + ⋯ + 1, which is to say a type consisting of exactly n choices.
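The sum-to-product isomorphism used in this derivation can be witnessed by a pair of Haskell functions, with Either standing in for + and pairs for ×. The names split and unsplit are hypothetical, chosen for this sketch.

```haskell
-- (a + b) -> c  to  (a -> c) × (b -> c)
split :: (Either a b -> c) -> (a -> c, b -> c)
split h = (h . Left, h . Right)

-- (a -> c) × (b -> c)  to  (a + b) -> c
unsplit :: (a -> c, b -> c) -> (Either a b -> c)
unsplit (f, g) = either f g
```

Ignoring bottoms, the two functions are mutually inverse: unsplit (split h) agrees with h on both Left and Right arguments, and split (unsplit (f, g)) gives back functions that agree with f and g.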

We have already seen a candidate for the index type q, of which Vec n is the natural trie functor. The post Type-bounded numbers defines a type BNat n corresponding to natural numbers less than n, where n is a type-encoded natural number.

data BNat ∷ * → * where
  BZero ∷          BNat (S n)
  BSucc ∷ BNat n → BNat (S n)

These two constructors correspond to two axioms about inequality: 0 < n + 1 and m < n ⇒ m + 1 < n + 1. The type BNat n then corresponds to canonical proofs that m < n for various values of m, where the proofs are built out of the two axioms and the law of modus ponens (which corresponds to function application).

Assuming a type n is built up solely from Z and S, a simple inductive argument shows that the number of fully-defined elements (not containing ⊥) of BNat n is the natural number corresponding to n (i.e., nat ∷ Nat n, where nat is as in the post Doing more with length-typed vectors).

Vectors as numbers

We've seen that Vec is a trie for bounded unary numbers, i.e., BNat n ↛ a ≡ Vec n a, using the notation from Elegant memoization with higher-order types. (Note that I've changed my pretty-printing from "k :→: v" to "k ↛ v".)

It's also the case that a vector of digits can be used to represent numbers:

type Digits n b = Vec n (BNat b)  -- n-digit number in base b

Or, more pleasing to my eye,

type Digits n b = BNat n ↛ BNat b

These representations can be given a little-endian or big-endian interpretation:

littleEndianToZ, bigEndianToZ ∷ ∀ n b. IsNat b ⇒ Digits n b → Integer

littleEndianToZ = foldr' (λ x s → fromBNat x + b * s) 0
 where b = natToZ (nat ∷ Nat b)

bigEndianToZ    = foldl' (λ s x → fromBNat x + b * s) 0
 where b = natToZ (nat ∷ Nat b)

The foldl' and foldr' are from Data.Foldable.

Give it a try:

*Vec> let ds = map toBNat [3,5,7] ∷ [BNat TenT]
*Vec> let v3 = fromList ds ∷ Vec ThreeT (BNat TenT)
*Vec> v3
fromList [3,5,7]
*Vec> littleEndianToZ v3
753
*Vec> bigEndianToZ v3
357

It's a shame here to map to the unconstrained Integer type, since (a) the result must be a natural number, and (b) the result is statically bounded by bⁿ.
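The two interpretations can also be tried without the Vec and BNat machinery. The sketch below assumes plain lists of Integer digits and passes the base as an explicit argument, so it drops the static guarantees of the typed versions.

```haskell
import Data.Foldable (foldl', foldr')

-- Digits given least significant first.
littleEndianToZ :: Integer -> [Integer] -> Integer
littleEndianToZ b = foldr' (\x s -> x + b * s) 0

-- Digits given most significant first.
bigEndianToZ :: Integer -> [Integer] -> Integer
bigEndianToZ b = foldl' (\s x -> x + b * s) 0
```

With base 10 and digits [3,5,7], these give 753 and 357 respectively, matching the GHCi session above.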

Doing more with length-typed vectors

Conal — Mon, 31 Jan 2011 01:16:09 +0000

The post Fixing lists defined a (commonly used) type of vectors, whose lengths are determined statically, by type. In Vec n a, the length is n, and the elements have type a, where n is a type-encoded unary number, built up from zero and successor (Z and S).

infixr 5 :<

data Vec ∷ * → * → * where
  ZVec ∷                Vec Z     a
  (:<) ∷ a → Vec n a → Vec (S n) a

It was fairly easy to define foldr for a Foldable instance, fmap for Functor, and (⊛) for Applicative. Completing the Applicative instance is tricky, however. Unlike foldr, fmap, and (⊛), pure doesn't have a vector structure to crawl over. It must create just the right structure anyway. I left this challenge as a question to amuse readers. In this post, I give a few solutions, including my current favorite.

You can find the code for this post and the two previous ones in a code repository.

An Applicative instance

As a review, here is our Functor instance:

instance Functor (Vec n) where
  fmap _ ZVec     = ZVec
  fmap f (a :< u) = f a :< fmap f u

And part of an Applicative instance:

instance Applicative (Vec n) where
  pure a = ??
  ZVec      ⊛ ZVec      = ZVec
  (f :< fs) ⊛ (x :< xs) = f x :< (fs ⊛ xs)

For pure, recall the troublesome goal signature:

  pure ∷ a → Vec n a

There's at least one very good reason this type is problematic. The type n is completely unrestricted. There is nothing to require n to be a natural number type, rather than Bool, String, String → Bool, etc.

In contrast to this difficulty with pure, consider what happens if n ≡ String in the type of fmap:

fmap ∷ (a → b) → Vec n a → Vec n b

The definition of Vec guarantees that there are no values of type Vec String a. So it's vacuously easy to cover that case (with an empty function). Similarly for (⊛).

If we were to somehow define pure with the type given above, then pure () would have type Vec String () (among many other types). However, there are no values of that type. Hence, pure cannot be defined without restricting n.

Since the essential difficulty here is the unrestricted nature of n, let's look at restricting it. We'll want to include exactly the types that can arise in constructing Vec values, namely Z, S Z, S (S Z), S (S (S Z)), etc.

As a first try, define a class with two instances:

class IsNat n

instance IsNat Z
instance IsNat n ⇒ IsNat (S n)

Then change the Applicative instance to require IsNat n:

instance IsNat n ⇒ Applicative (Vec n) where
  ⋯

The definition of (⊛) given above still type-checks. Well, not quite. Really, the recursive call to (⊛) fails to type-check, because the IsNat constraint cannot be proved. One solution is to add that constraint to the vector type:

data Vec ∷ * → * → * where
  ZVec ∷ Vec Z a
  (:<) ∷ IsNat n ⇒ a → Vec n a → Vec (S n) a

Another is to break the definition (⊛) out into a separate recursion that omits the IsNat constraint:

instance IsNat n ⇒ Applicative (Vec n) where
  pure = ???
  (⊛)  = applyV

applyV ∷ Vec n (a → b) → Vec n a → Vec n b
ZVec      `applyV` ZVec      = ZVec
(f :< fs) `applyV` (x :< xs) = f x :< (fs `applyV` xs)

Now, how can we define pure? We still don't have enough structure. To get that structure, add a method to IsNat. That method could simply be the definition of pure that we need.

class IsNat n where pureN ∷ a → Vec n a

instance IsNat Z                where pureN a = ZVec
instance IsNat n ⇒ IsNat (S n) where pureN a = a :< pureN a

To get this second instance to type-check, we'll have to add the constraint IsNat n to the (:<) constructor in Vec. Then define pure = pureN for Vec.

I prefer a variation on this solution. Instead of pureN, use a method that can only make vectors of ():

class IsNat n where units ∷ Vec n ()

instance            IsNat Z     where units = ZVec
instance IsNat n ⇒ IsNat (S n) where units = () :< units

Then define

  pure a = fmap (const a) units

Neat trick, huh? I got it from Applicative Programming with Effects (section 7).
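
Here is a self-contained sketch of the units variation in plain ASCII Haskell. It assumes the Vec definition without the IsNat constraint on (:<), and it uses the hypothetical names pureV and toListV so that it stays independent of any Applicative instance.

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}
import Data.Kind (Type)

data Z
data S n

infixr 5 :<
data Vec :: Type -> Type -> Type where
  ZVec :: Vec Z a
  (:<) :: a -> Vec n a -> Vec (S n) a

instance Functor (Vec n) where
  fmap _ ZVec     = ZVec
  fmap f (a :< u) = f a :< fmap f u

-- The class supplies the structure that pure has to invent.
class IsNat n where units :: Vec n ()

instance IsNat Z where units = ZVec
instance IsNat n => IsNat (S n) where units = () :< units

-- Stand-in for pure: fill a correctly shaped vector with one value.
pureV :: IsNat n => a -> Vec n a
pureV a = fmap (const a) units

-- Hypothetical helper for inspecting results.
toListV :: Vec n a -> [a]
toListV ZVec      = []
toListV (a :< as) = a : toListV as
```

The length is chosen entirely by the type annotation at the use site, e.g. toListV (pureV 'x' ∷ Vec (S (S (S Z))) Char) yields a three-element list.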

Value-typed natural numbers

There's still another way to define IsNat, and it's the one I actually use.

Define a type of natural number with matching value & type:

data Nat ∷ * → * where
  Zero ∷                    Nat Z
  Succ ∷ IsNat n ⇒ Nat n → Nat (S n)

Interpret a Nat as an Integer

natToZ ∷ Nat n → Integer
natToZ Zero     = 0
natToZ (Succ n) = (succ ∘ natToZ) n

I wrote the second clause strangely to emphasize the following lovely property, which corresponds to a simple commutative diagram:

natToZ ∘ Succ = succ ∘ natToZ

This natToZ function is handy for showing natural numbers:

instance Show (Nat n) where show = show ∘ natToZ

A fun & strange thing about Nat n is that it can have at most one inhabitant for any type n. We can synthesize that inhabitant via an alternative definition of the IsNat class defined (twice) above:

class IsNat n where nat ∷ Nat n

instance            IsNat Z     where nat = Zero
instance IsNat n ⇒ IsNat (S n) where nat = Succ nat

Using this latest version of IsNat, we can easily define units (and hence pure on Vec n for IsNat n):

units ∷ IsNat n ⇒ Vec n ()
units = unitsN nat

unitsN ∷ Nat n → Vec n ()
unitsN Zero     = ZVec
unitsN (Succ n) = () :< unitsN n

I prefer this latest IsNat definition over the previous two, because it relies only on Nat, which is simpler and more broadly useful than Vec. Examples abound, including improving a recent post, as we'll see now.

Revisiting type-bounded numbers

The post Type-bounded numbers defined a type BNat n of natural numbers less than n, which can be used, for instance, as numerical digits in base n.

data BNat ∷ * → * where
  BZero ∷          BNat (S n)
  BSucc ∷ BNat n → BNat (S n)

One useful operation is conversion from integer to BNat n. This operation had the awkward task of coming up with BNat structure. The solution given was to introduce a type class, with instances for Z and S:

class HasBNat n where toBNat ∷ Integer → Maybe (BNat n)

instance HasBNat Z where toBNat _ = Nothing

instance HasBNat n ⇒ HasBNat (S n) where
  toBNat m | m < 1     = Just BZero
           | otherwise = fmap BSucc (toBNat (pred m))

We can instead eliminate the HasBNat class and reuse the IsNat class, as in the last technique above for defining units or pure.

toBNat ∷ ∀ n. IsNat n ⇒ Integer → Maybe (BNat n)
toBNat = loop n where
  n = nat ∷ Nat n
  loop ∷ Nat n' → Integer → Maybe (BNat n')
  loop Zero      _ = Nothing
  loop (Succ _)  0 = Just BZero
  loop (Succ n') m = fmap BSucc (loop n' (pred m))
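
For experimenting, the pieces above can be assembled into one module. This is a sketch in plain ASCII Haskell; it relies on the ScopedTypeVariables extension so that the n in the signature of toBNat is visible in its body.

```haskell
{-# LANGUAGE GADTs, KindSignatures, ScopedTypeVariables #-}
import Data.Kind (Type)

data Z
data S n

data Nat :: Type -> Type where
  Zero :: Nat Z
  Succ :: IsNat n => Nat n -> Nat (S n)

class IsNat n where nat :: Nat n

instance IsNat Z where nat = Zero
instance IsNat n => IsNat (S n) where nat = Succ nat

data BNat :: Type -> Type where
  BZero :: BNat (S n)
  BSucc :: BNat n -> BNat (S n)

fromBNat :: BNat n -> Integer
fromBNat BZero     = 0
fromBNat (BSucc n) = succ (fromBNat n)

-- Walk down the value-level Nat, which supplies the structure to build.
toBNat :: forall n. IsNat n => Integer -> Maybe (BNat n)
toBNat = loop (nat :: Nat n)
  where
    loop :: Nat n' -> Integer -> Maybe (BNat n')
    loop Zero      _ = Nothing
    loop (Succ _)  0 = Just BZero
    loop (Succ n') m = fmap BSucc (loop n' (pred m))
```

A negative argument never matches the 0 clause, so the loop runs out of Succ constructors and returns Nothing, whereas the HasBNat version maps negative integers to BZero.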

A Monad instance

First the easy parts: standard definitions in terms of pure and join:

instance IsNat n ⇒ Monad (Vec n) where
  return  = pure
  v >>= f = join (fmap f v)

The join function on Vec n is just like join for functions and for streams. (Rightly so, considering the principle of type class morphisms.) It uses diagonalization, and one way to think of vector join is that it extracts the diagonal of a square matrix.

join ∷ Vec n (Vec n a) → Vec n a
join ZVec      = ZVec
join (v :< vs) = headV v :< join (fmap tailV vs)

The headV and tailV functions are like head and tail but understand lengths:

headV ∷ Vec (S n) a → a
headV (a :< _) = a

tailV ∷ Vec (S n) a → Vec n a
tailV (_ :< as) = as

Unlike their list counterparts, headV and tailV are safe, in that the precondition of non-emptiness is verified statically.
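
To see the diagonal reading in action, here is a small stand-alone sketch in plain ASCII Haskell. The function is named joinV (an assumption for illustration) to avoid any clash with Control.Monad.join, and toListV is a hypothetical helper for inspecting the result.

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}
import Data.Kind (Type)

data Z
data S n

infixr 5 :<
data Vec :: Type -> Type -> Type where
  ZVec :: Vec Z a
  (:<) :: a -> Vec n a -> Vec (S n) a

instance Functor (Vec n) where
  fmap _ ZVec     = ZVec
  fmap f (a :< u) = f a :< fmap f u

headV :: Vec (S n) a -> a
headV (a :< _) = a

tailV :: Vec (S n) a -> Vec n a
tailV (_ :< as) = as

-- Take the head of the first row, then recur on the remaining rows
-- with their first column removed.
joinV :: Vec n (Vec n a) -> Vec n a
joinV ZVec      = ZVec
joinV (v :< vs) = headV v :< joinV (fmap tailV vs)

toListV :: Vec n a -> [a]
toListV ZVec      = []
toListV (a :< as) = a : toListV as

-- A 2-by-2 matrix with rows [1,2] and [3,4]; its diagonal is [1,4].
square :: Vec (S (S Z)) (Vec (S (S Z)) Int)
square = (1 :< 2 :< ZVec) :< (3 :< 4 :< ZVec) :< ZVec
```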

Fixing lists

Conal — Sun, 30 Jan 2011 18:14:30 +0000

In the post Memoizing polymorphic functions via unmemoization, I toyed with the idea of lists as tries. I don’t think [a] is a trie, simply because [a] is a sum type (being either nil or a cons), while tries are built out of the identity, product, and composition functors. In contrast, Stream is a trie, being built solely with the identity and product functors. Moreover, Stream is not just any old trie, it is the trie that corresponds to Peano (unary natural) numbers, i.e., Stream a ≅ N → a, where

data N = Zero | Succ N

data Stream a = Cons a (Stream a)

If we didn't already know the Stream type, we would derive it systematically from N, using standard isomorphisms.

Stream is a trie (over unary numbers), thanks to it having no choice points, i.e., no sums in its construction. However, streams are infinite-only, which is not always what we want. In contrast, lists can be finite, but are not a trie in any sense I understand. In this post, I look at how to fix lists, so they can be finite and yet be a trie, thanks to having no choice points (sums).

You can find the code for this post and the previous one in a code repository.

Edits:

2011-01-30: Added spoilers warning.
2011-01-30: Pointer to code repository.

Fixing lists

Is there a type of finite lists without choice points (sums)? Yes. There are lots of them. One for each length. Instead of having a single type of lists, have an infinite family of types of n-element lists, one type for each n.

In other words, to fix the problem with lists (trie-unfriendliness), split up the usual list type into subtypes (so to speak), each of which has a fixed length.

I realize I'm changing the question to a simpler one. I hope you'll forgive me and hang in to see where this ride goes.

As a first try, we might use tuples as our fixed-length lists:

type L0 a = ()
type L1 a = (a)
type L2 a = (a,a)
type L3 a = (a,a,a)
⋯

However, we can only write down finitely many such types, and I don't know how we could write any definitions that are polymorphic over length.

What can "polymorphic over length" mean in a setting like Haskell, where polymorphism is over types rather than values? Can we express numbers (for lengths, etc) as types? Yes, as in the previous post, Type-bounded numbers, using a common encoding:

data Z    -- zero
data S n  -- successor

Given these type-level numbers, we can define a data type Vec n a, containing only vectors (fixed lists) of length n and elements of type a. Such vectors can be built up as either the zero-length vector, or by adding an element to a vector of length n to get a vector of length n + 1. I don't know how to define this type as a regular algebraic data type, but it's easy as a generalized algebraic data type (GADT):

infixr 5 :<

data Vec ∷ * → * → * where
  ZVec ∷                Vec Z     a
  (:<) ∷ a → Vec n a → Vec (S n) a

For example,

*Vec> :ty 'z' :< 'o' :< 'm' :< 'g' :< ZVec
'z' :< 'o' :< 'm' :< 'g' :< ZVec ∷ Vec (S (S (S (S Z)))) Char

As desired, Vec is length-typed, covers all (finite) lengths, and allows definition of length-polymorphic functions. For instance, it's easy to map functions over vectors:

instance Functor (Vec n) where
  fmap _ ZVec     = ZVec
  fmap f (a :< u) = f a :< fmap f u

The type of fmap here is (a → b) → Vec n a → Vec n b.

Folding over vectors is also straightforward:

instance Foldable (Vec n) where
  foldr _ b ZVec      = b
  foldr h b (a :< as) = a `h` foldr h b as

Is Vec n an applicative functor as well?

instance Applicative (Vec n) where
  ⋯

We would need

pure ∷ a → Vec n a
(⊛)  ∷ Vec n (a → b) → Vec n a → Vec n b

The (⊛) method can be defined similarly to fmap:

  ZVec      ⊛ ZVec      = ZVec
  (f :< fs) ⊛ (x :< xs) = f x :< (fs ⊛ xs)

Unlike fmap and (⊛), pure doesn't have a vector structure to crawl over. It must create just the right structure anyway. You might enjoy thinking about how to solve this puzzle, which I'll tackle in my next post. (Warning: spoilers in the comments below.)

Type-bounded numbers

Conal — Sat, 29 Jan 2011 17:53:57 +0000

I’ve been thinking a lot lately about how to derive low-level massively parallel programs from high-level specifications. One of the recurrent tools is folds (reductions) with an associative operator. Associativity allows a linear chain of computations to be restructured into a tree, exposing parallelism. I’ll write up some of my thoughts on deriving parallel programs, but first I’d like to share a few fun ideas I’ve encountered, relating natural numbers (represented in various bases), vectors (one-dimensional arrays), and trees. This material got rather long for a single blog post, so I’ve broken it up into six. A theme throughout will be using types to capture the sizes of the numbers, vectors, and trees.

In writing this series, I wanted to explore an idea for how binary numbers can emerge from simpler and/or more universal notions. And how trees can likewise emerge from binary numbers.

Let’s start with unary (Peano) natural numbers:

data Unary = Zero | Succ Unary

You might notice a similarity with the list type, which could be written as follows:

data List a = Nil | Cons a (List a)

or with a bit of renaming:

data [a] = [] | a : [a]

Specializing a to (), we could just as well have defined Unary as a list of unit values:

type Unary = [()]

Though only if we're willing to ignore bottom elements (i.e., ⊥ ∷ ()).

Suppose, however, that we don't want to use unary. We could define and use a type for binary natural numbers. A binary number is either zero, or a zero bit followed by a binary number, or a one bit followed by a binary number:

data Binary = Zero | ZeroAnd Binary | OneAnd Binary

Alternatively, combine the latter two cases into one, making the bit type explicit:

data Binary = Zero | NonZero Bit Binary

Equivalently,

type Binary = [Bit]

We could define the Bit type as a synonym for Bool or as its own distinct, two-valued data type.

Next, how about ternary numbers, decimal numbers, etc? Rather than defining an ad hoc collection of data types, how might we define a single general type of n-ary numbers?

You can find the code for this post in a code repository.

Edits:

2011-01-30: Example of finding the natural numbers greater than a given one
2011-01-30: Equality and comparison
2011-01-30: More section headings
2011-01-30: Mention of correspondence to commutative diagram
2011-01-30: Pointer to code repository.

First try

As a first crack, we might replace the () and Bit types above with a general Digit type, defined as an integer:

type Digit = Integer

type Nary = [Digit]

We could then define operations to convert between Integer and Nary, and to perform arithmetic operations on Nary.

This first approach has some drawbacks:

In each case, one parameter would have to be the number base.
One can accidentally produce a number as if in one base and then consume as if in another.
One can accidentally add numbers in different bases.
For base n, every digit is required to be in the range 0, ..., n - 1. This constraint is not enforced statically, and so would either have to be checked dynamically (costing time and code) or could be accidentally broken.

I'd rather have the number bases be statically apparent and statically checked.

What I'm looking for is a type of bounded natural numbers, where BNat n consists of values that correspond to 0, ..., n - 1. Then

type Nary n = [BNat n]

Since the base is now part of the static type, it does not have to be passed in explicitly (and perhaps incorrectly), and it cannot be used inconsistently.

But wait a minute! We don't have dependent types (i.e., types that depend on values) in Haskell, so what could I mean by "n" in "BNat n"?

Type-level natural numbers

A by-now-common trick is to build type-level unary natural numbers out of two data types.

data Z    -- zero
data S n  -- successor

We won't use values of these types, so there are no corresponding value constructors.

Some handy aliases:

type ZeroT = Z
type OneT  = S ZeroT
type TwoT  = S OneT
⋯
type TenT  = S NineT

For instance, FourT is the type S (S (S (S Z))).

Type-bounded unary numbers

We want our type BNat n to consist of (values corresponding to) 0, ..., n - 1. To get an inductive perspective, note that (a) BNat Z is empty; and (b) an element of BNat (S n) is either 0, or it is one more than a number in the range 0, ..., n - 1, i.e., an element of BNat n. These two possibilities lead directly to a representation for BNat, as a GADT (generalized algebraic data type):

data BNat ∷ * → * where
  BZero ∷          BNat (S n)
  BSucc ∷ BNat n → BNat (S n)

Conversion to and from integers

These two constructors correspond to two facts about inequality: 0 < n + 1, and m < n ⇒ m + 1 < n + 1. Elements of the type BNat n then correspond to canonical proofs that m < n for various values of m, where the proofs are built out of the two facts and modus ponens (i.e., ((P ⇒ Q) ∧ P) ⇒ Q, which corresponds to function application).

We can extract the m of these proofs:

fromBNat ∷ BNat n → Integer
fromBNat BZero     = 0
fromBNat (BSucc n) = (succ ∘ fromBNat) n

I wrote the second clause strangely to emphasize the following lovely property, corresponding to a commutative diagram:

fromBNat ∘ BSucc = succ ∘ fromBNat

Note that the type of fromBNat may be generalized:

fromBNat ∷ (Enum a, Num a) ⇒ BNat n → a

To present BNat values, convert them to integers:

instance Show (BNat n) where show = show ∘ fromBNat

The reverse mapping is handy also, i.e., for a number type n, given an integer m, generate a proof that m < n, or fail if m ≥ n.

toBNat ∷ Integer → Maybe (BNat n)

Unlike fromBNat, toBNat doesn't have a structure to crawl over. It must create just the right structure anyway. What can we do?

One solution is to use a type class, with instances for Z and S:

class HasBNat n where toBNat ∷ Integer → Maybe (BNat n)

instance HasBNat Z where toBNat _ = Nothing

instance HasBNat n ⇒ HasBNat (S n) where
  toBNat m | m < 1     = Just BZero
           | otherwise = fmap BSucc (toBNat (pred m))

*BNat> toBNat  3 ∷ Maybe (BNat EightT)
Just 3
*BNat> toBNat 10 ∷ Maybe (BNat EightT)
Nothing

Later blog posts will include another solution for toBNat as well as some applications of BNat.

We can also get a description of all natural numbers greater than a given one:

*BNat> :ty BSucc (BSucc BZero)
BSucc (BSucc BZero) ∷ BNat (S (S (S n)))

In words, the natural numbers greater than two are exactly those of the form 3 + n, for natural numbers n.

Equality and comparison

Equality and ordering are easily defined, all based on simple properties of numbers:

instance Eq (BNat n) where
  BZero   ≡ BZero    = True
  BSucc m ≡ BSucc m' = m ≡ m'
  _       ≡ _        = False

instance Ord (BNat n) where
  BZero   `compare` BZero    = EQ
  BSucc m `compare` BSucc m' = m `compare` m'
  BZero   `compare` BSucc _  = LT
  BSucc _ `compare` BZero    = GT

Adding numbers

Conal — Mon, 25 Oct 2010 20:57:41 +0000

Introduction

I’m starting to think about exact numeric computation. As a first step in getting into issues, I’ve been playing with addition on number representations, particularly carry look-ahead adders.

This post plays with adding numbers and explores a few variations, beginning with the standard algorithm I learned as a child, namely working from right to left (least to most significant), propagating carries. For fun & curiosity, I also try out a pseudo-parallel version using circular programming, as well as a state-monad formulation. Each of these variations has its own elegance.

While familiar and simple, right-to-left algorithms have a fundamental limitation. Since they begin with the least significant digit, they cannot be applied to numbers that have infinitely many decreasingly significant digits. To add exact real numbers, we’ll need a different algorithm.

Given clear formulations of right-to-left addition, and with exact real addition in mind, I was curious about left-to-right addition. The circular formulation adapts straightforwardly. Delightfully, the monadic version adapts even more easily, by replacing the usual state monad with the backward state monad.

To exploit the right-to-left algorithms in exact real addition, I had to tweak the single-digit addition step to be a bit laxer (less strict). With this change, infinite-digit addition works just fine.

Full adders

In AddingMachines.hs, define a full adder, which takes two values to add and a carry flag, and produces a sum with a carry flag:

type Adder a = a → a → Bool → (a, Bool)

Define a single-digit full adder, for a given base:

addBase ∷ (Num a, Ord a) => a → Adder a
addBase base a b carry | sum' < base = (sum', False)
                       | otherwise   = (sum'-base, True)
  where
    sum' = a + b + if carry then 1 else 0

For the examples below, I’ll specialize to base 10:

add10 ∷ Adder Int
add10 = addBase 10

Then string together (full) adders to make multi-digit adders:

adds ∷ Adder a → Adder [a]

The digits are in little-endian order, i.e., least to most significant, which is the order in which I was taught to add.

Explicit carry threading

How to string together the carries? For simplicity, require that the two digit lists have the same length. As a first implementation, let’s thread the carries through manually:

adds ∷ Adder a → Adder [a]
adds _ [] [] i = ([],i)
adds add (a:as) (b:bs) i = (c:cs,o')
  where
    (c ,o ) = add a b i
    (cs,o') = adds add as bs o
adds _ _ _ _ = error "adds: differing number of digits"

In this definition and throughout this post, I’ll use the names “i” and “o” for incoming and outgoing carry flags, respectively.

Try it:

*AddingMachines> adds add10 [3,5,7,8] [1,6,4,1] False
([4,1,2,0],True)

Pseudo-parallel carries

Here’s an idea for a more elegant approach: do all of the additions in parallel, with the list of carries coming in. The input carries come from the outputs of the additions, shifted by one position, resulting in a circular program.

addsP add as bs i = (cs,last is)
 where
   (cs,os) = unzip (zipWith3 add as bs is)
   is = i : os

Note the mutual recursion in the two local definitions. I’m relying on the last element of is being dropped by zipWith3. I could instead pass in init is, but when I do so, no digits get out. I think the reason has to do with a subtlety in the definition of init.

Try it:

*AddingMachines> addsP add10 [3,5,7,8] [1,6,4,1] False
([4,1,2,0],True)

What makes addsP productive, i.e., what allows us to get information out of this circular definition? I think the key to productivity in addsP is that the first element of is is available before anything at all is known about os, and then the second element of is is ready when only the first element of os is knowable, etc.

State monad

The explicit threading done in the first adds definition above is just the sort of thing that the State monad takes care of. In StateAdd.hs, define a carrier monad to be State with a boolean carry flag as state:

type Carrier = State Bool

Then tweak the Adder type:

type Adder a = a → a → Carrier a

For single-digit addition, just wrap the previous version (imported qualified as AM):

addBase ∷ (Ord a, Num a) => a → Adder a
addBase base a b = State (AM.addBase base a b)

Or, using semantic editor combinators,

addBase = (result.result.result) State AM.addBase

A big win with the Carrier monad is that zipWithM handles carry-propagation exactly as needed for multi-digit addition:

adds ∷ Adder a → Adder [a]
adds = zipWithM

Try it:

*StateAdd> runState (adds add10 [3,5,7,8] [1,6,4,1]) False
([4,1,2,0],True)
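
The State data constructor used above is no longer exported by current versions of the mtl library, as far as I know; there State is a type synonym and the state function plays the constructor's role. To sidestep library versions altogether, the following sketch defines a minimal state monad inline (an assumption made for self-containment, not the post's StateAdd.hs) and inlines the single-digit adder instead of importing it.

```haskell
import Control.Monad (zipWithM)

-- A minimal state monad, so that the sketch depends only on base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in  (h a, s'')

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

type Carrier = State Bool

type Adder a = a -> a -> Carrier a

-- Single-digit full adder; the carry flag is the monadic state.
addBase :: (Ord a, Num a) => a -> Adder a
addBase base a b = State $ \carry ->
  let sum' = a + b + (if carry then 1 else 0)
  in  if sum' < base then (sum', False) else (sum' - base, True)

add10 :: Adder Int
add10 = addBase 10

-- zipWithM sequences the digit additions left to right, threading the carry.
adds :: Adder a -> Adder [a]
adds = zipWithM
```

With these definitions, runState (adds add10 [3,5,7,8] [1,6,4,1]) False evaluates to ([4,1,2,0],True), matching the session above.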

Addition in reverse

So far, we’re adding digits in the standard direction: from least to most significant. This order makes it easy to propagate carries in the explicit-threading and the state monad formulations of multi-digit addition. However, it also has a potentially serious drawback. Suppose a computation depends on an approximation to the sum of two numbers, e.g., only the first few most significant digits. Then the unnecessary less-significant digits will all have to be computed anyway.

One extreme and important example of this drawback is when there are infinitely many digits of diminishing significance, which is the case with exact reals, when our digits are past the radix point. The algorithms above cannot even represent such numbers, since the digit lists are from least to most significant.

Let’s reverse the order of digits in our number representations, so they run from most to least significant. With this reversal, we can easily represent numbers with infinitely many decreasingly significant digits. Carrying, however, becomes trickier. As a first try, here’s an explicitly threaded multi-digit adder:

addsR ∷ Adder a → Adder [a]
addsR _ [] [] i = ([],i)
addsR add (a:as) (b:bs) i = (c:cs,o')
  where
    (c,o') = add a b o
    (cs,o) = addsR add as bs i
addsR _ _ _ _ = error "addsR: differing number of digits"

The only difference from the forward-carry adds is in the local definitions, which propagate the carry. From the original version:

    (c ,o ) = add a b i
    (cs,o') = adds add as bs o

With this change, carries now propagate in the reverse order. To remind us of the changed direction, I’ll swap the digits & carry flag in testing.

*AddingMachines> swap $ addsR add10 [3,5,7,8] [1,6,4,1] False
(False,[5,2,1,9])
*AddingMachines> swap $ addsR add10 [8,7,5,3] [1,4,6,1] False
(True,[0,2,1,4])

where

swap ∷ (a,b) → (b,a)
swap (a,b) = (b,a)

Lax addition

Let’s now see how lax our definitions are. (Reminder: laxness is the opposite of strictness and is sometimes confused with the operational notion of laziness.)

Back to our original example, using forward carrying (from least to most significant):

*AddingMachines> adds add10 [3,5,7,8] [1,6,4,1] False
([4,1,2,0],True)

Now replace the third digit of one number with ⊥. There is still enough information to compute the two least significant digits:

*AddingMachines> adds add10 [3,5,⊥,8] [1,6,4,1] False
([4,1,*** Exception: Prelude.⊥

If we reverse the digits and try again, we run into trouble at the outset, with both the carry and the digits.

*AddingMachines> let q = swap $ addsR add10 [8,⊥,5,3] [1,4,6,1] False
*AddingMachines> fst q
*** Exception: Prelude.⊥
*AddingMachines> snd q
[*** Exception: Prelude.⊥

Closer examination shows that the two least significant digits are still computed (as before):

*AddingMachines> snd q !! 0
*** Exception: Prelude.⊥
*AddingMachines> snd q !! 1
*** Exception: Prelude.⊥
*AddingMachines> snd q !! 2
1
*AddingMachines> snd q !! 3
4

In this example, neither of the most significant two digits can be known. The sum might start with a 9 with carry False, or it could start with a 0 with carry True, depending on whether there is a carry coming out of adding the second digit.

Now consider the sum 74⊥⊥ + 13⊥⊥.

Just from looking at the first digits, we can deduce that the overall carry is False. Similarly, looking at the second digits, we know their carry is also False, which means we also know that the first sum is 7 + 1 + 0 == 8. Let’s see how addsR does with this example:

*AddingMachines> swap $ addsR add10 [7,4,⊥,⊥] [1,3,⊥,⊥] False
(*** Exception: Prelude.⊥

To get more information out, rewrite addBase to be laxer in the carry argument. Where possible, compute the carry-out based solely on the digits being added, without considering the carry-in. The carry-out must be false if those digits sum to less than base-1, and must be true if the digits sum to more than base-1. Only when the sum is exactly equal to base-1 must the carry-in be examined.

addBase ∷ (Num a, Ord a) => a → Adder a
addBase base a b carry =
  case ab `compare` (base - 1) of
    LT → uncarried
    GT → carried
    EQ → if sum' < base then uncarried else carried
  where
    ab = a + b
    sum' = ab + if carry then 1 else 0
    uncarried = (sum',False)
    carried = (sum'-base,True)

Sure enough, we can now extract the overall carry and the most significant digit of the sum:

*AddingMachines> swap $ addsR add10 [7,4,⊥,⊥] [1,3,⊥,⊥] False
(False,[8,*** Exception: Prelude.⊥

We can also handle an unknown or infinite number of digits:

*AddingMachines> swap $ addsR add10 (7:4:⊥) (1:3:⊥) ⊥
(False,[8,*** Exception: Prelude.⊥

Note that even the carry-in flag is undefined, as it wouldn’t be used until the least-significant digit. If we know we’ll have infinitely many digits, then we can discard the carry-in bit altogether for addsR.
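
The lax adder and the reverse-carry addsR can be combined into one small module for experimentation. This is a sketch in plain ASCII Haskell, with undefined standing in for ⊥.

```haskell
type Adder a = a -> a -> Bool -> (a, Bool)

-- Lax single-digit adder: the carry-out is decided from the digits alone
-- unless they sum to exactly base - 1.
addBase :: (Num a, Ord a) => a -> Adder a
addBase base a b carry =
  case ab `compare` (base - 1) of
    LT -> uncarried
    GT -> carried
    EQ -> if sum' < base then uncarried else carried
  where
    ab        = a + b
    sum'      = ab + (if carry then 1 else 0)
    uncarried = (sum', False)
    carried   = (sum' - base, True)

-- Most-significant-first addition; the carry flows from the tail
-- of the list back toward the head.
addsR :: Adder a -> Adder [a]
addsR _ [] [] i = ([], i)
addsR add (a:as) (b:bs) i = (c:cs, o')
  where
    (c , o') = add a b o
    (cs, o ) = addsR add as bs i
addsR _ _ _ _ = error "addsR: differing number of digits"
```

For instance, with (ds, c) = addsR (addBase 10) [7,4,undefined,undefined] [1,3,undefined,undefined] False, both c and head ds can be evaluated (to False and 8), while forcing the second digit hits the undefined input.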

Pseudo-parallel, reverse carry

Recall pseudo-parallel forward-carrying addition from above:

addsP add as bs i = (cs,last is)
 where
   (cs,os) = unzip (zipWith3 add as bs is)
   is = i : os

Reversing the digits requires shifting the carries in the other direction. Instead of rotating the external carry into the start of the carry list and removing the last element, we’ll rotate the external carry into the end of the carry list and remove the first element.

addsPR ∷ Adder a → Adder [a]
addsPR add as bs i = (cs,head is)
 where
   (cs,os) = unzip (zipWith3 add as bs (tail is))
   is = os ++ [i]

Testing this definition, even on fully defined input, shows that neither the carry nor the digits produce any information.

After some head-scratching, it occurred to me that zipWith3 is overly strict in its last argument for our purposes. The definition:

zipWith3 ∷ (a → b → c → d) → [a] → [b] → [c] → [d]
zipWith3 f (a:as) (b:bs) (c:cs) = f a b c : zipWith3 f as bs cs
zipWith3 _ _ _ _ = []

With this definition, zipWith3 cannot produce any information until it knows whether its last argument is empty or non-empty. With some thought, we can see that in our use, the last list has the same length as the shorter of the other two lists, but the compiler doesn’t know, so it uses an unnecessary run-time check (that cannot complete). We can avoid this problem by using a lazier version of zipWith3. The only difference is the use of a lazy pattern for the last argument.

zipWith3' ∷ (a → b → c → d) → [a] → [b] → [c] → [d]
zipWith3' f (a:as) (b:bs) ~(c:cs) = f a b c : zipWith3' f as bs cs
zipWith3' _ _ _ _ = []

Then amend the definition of addsPR to use zipWith3', and away we go:

*AddingMachines> addsPS add10 (7<:>4<:>⊥) (1<:>3<:>⊥)
(False,8 <:> *** Exception: Prelude.⊥

Recall that in the original definition of addsP, the last argument to zipWith3 was is instead of the more fitting init is. When I tried init is, evaluation gets stuck at the beginning. Switching from zipWith3 to zipWith3' gets the first three digits out but not the last one. I suspect the reason has to do with the definition of init, which has to know when the list has only one element, rather than being almost empty.

Now let’s look at a really infinite example:

  .357835783578...
+ .726726726726...
------------------
 1.08456251030?...

The “?” is for a digit that can be either 4 or 5, depending on carry. The incoming carry never gets used, so let’s make it ⊥.

*AddingMachines> swap $ addsPR add10 (cycle [3,5,7,8]) (cycle [7,2,6]) ⊥
(True,[0,8,4,5,6,2,5,

The computation got stuck after 1.0845625.

It took me a fair bit of poking around to find the problem. A clue along the way was this surprise:

*AddingMachines> length $ fst $ unzip [⊥]
*** Exception: Prelude.⊥

The culprit turns out to be an unfortunate interaction between the definitions of unzip and addBase:

unzip ∷ [(a,b)] → ([a],[b])
unzip =  foldr (λ(a,b) ~(as,bs) → (a:as,b:bs)) ([],[])

Look carefully at the function passed to foldr. It’s lazy in its second argument and strict in its first. The laziness here allows the input list to be demanded incrementally as the output list is demanded. Without that laziness, unzip couldn’t handle infinite lists. The strict pattern in the first position means that each incoming value must be evaluated enough to see the outer pair structure. Thus, e.g., unzip [⊥] is ⊥, rather than the pair ([⊥],[⊥]).

Here’s a lazier version that works for addsPR:

unzip' = foldr (λ ~(a,b) ~(as,bs) → (a:as,b:bs)) ([],[])

*AddingMachines> length $ fst $ unzip' [⊥]
1

Now look at our (second) definition of addBase from above:

addBase ∷ (Num a, Ord a) => a → Adder a
addBase base a b carry =
  case ab `compare` (base - 1) of
    LT → uncarried
    GT → carried
    EQ → if sum' < base then uncarried else carried
  where
    ab = a + b
    sum' = ab + if carry then 1 else 0
    uncarried = (sum',False)
    carried = (sum'-base,True)

Consider the EQ case, i.e., a+b == base-1. Evaluating the conditional produces either uncarried or carried, each of which is manifestly a pair. However, no outer pair structure can be seen until the boolean resolves to either True or False. In this case, the boolean depends on sum', which depends on carry. Thus pairness cannot be seen until carry is evaluated.

Using unzip' instead of unzip in addsPR gets us unstuck:

*AddingMachines> swap $ addsPR add10 (cycle [3,5,7,8]) (cycle [7,2,6]) ⊥
(True,[0,8,4,5,6,2,5,1,0,3,0,5,0,8,4,5,6,2,5,1,0,3,0,5,0,8,4,5,6,2,‥.])

(I filled in the “‥.” here.)

Some other lax solutions

Rather than making unzip demand less information, we can instead make addBase provide more information.

Hack away

One way to make addBase more defined is to dig into the definition of addBase, changing the EQ case to generate a pair immediately. For instance,

    EQ → (if sum' < base then sum' else sum'-base, sum' >= base)

This version has a lot of repetition, adding more awkwardness to an already awkward definition of addBase.

Laxer if-then-else

I recently wrote two posts on “lazier functional programming”. Part one offered the puzzle of how to make if-then-else and either laxer (less strict). Part two revealed elegant solutions in terms of the least-upper-bound and greatest-lower-bound operators ((⊔) and (⊓)) from domain theory.

The laxer if-then-else from part two gives us a drop-in replacement for the standard, overly strict conditional used in addBase:

    EQ → laxIf (sum' < base) uncarried carried

where

laxIf ∷ Bool → a → a → a
laxIf c a b = cond a b c

cond ∷ a → a → Bool → a
cond a b = const (a ⊓ b) ⊔ (λ c → if c then a else b)

The key to fixing addBase is that uncarried ⊓ carried == (⊥,⊥).

Lax pairs

A more specialized solution is to use the standard if-then-else and draw out the pair structure of the result without looking at it.

laxPair ∷ (a,b) → (a,b)
laxPair  ~(a,b) = (a,b)

We can apply laxPair directly to the conditional:

    EQ → laxPair $ if sum' < base then uncarried else carried

or to the body of addBase as a whole:

addBase ∷ (Num a, Ord a) => a → Adder a
addBase base a b carry = laxPair $ ‥.

or even from the outside:

addBase' ∷ (Num a, Ord a) => a → Adder a
addBase' base a b carry = laxPair $ addBase base a b carry

In this last case, semantic editor combinators allow for a more elegant formulation:

addBase' = (result.result.result.result) laxPair addBase
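
The "from the outside" variant can be checked in isolation with the following sketch, written in plain ASCII Haskell. The `Adder` synonym is my reconstruction of the type used in this post, `result` is defined as function composition (the usual semantic editor combinator), and the names `addBaseStrict`, `addBaseLax`, and `pairVisible` are illustrative only.

```haskell
-- Assumed shape of Adder: two digits and a carry in, giving a digit and a carry out.
type Adder a = a -> a -> Bool -> (a, Bool)

-- The stricter definition: in the EQ case no pair is visible
-- until the carry in has been evaluated.
addBaseStrict :: (Num a, Ord a) => a -> Adder a
addBaseStrict base a b carry =
  case ab `compare` (base - 1) of
    LT -> uncarried
    GT -> carried
    EQ -> if sum' < base then uncarried else carried
  where
    ab        = a + b
    sum'      = ab + if carry then 1 else 0
    uncarried = (sum', False)
    carried   = (sum' - base, True)

-- Expose the pair structure without inspecting the argument.
laxPair :: (a, b) -> (a, b)
laxPair ~(a, b) = (a, b)

-- Semantic editor combinator: edit the result of a function.
result :: (b -> b') -> ((a -> b) -> (a -> b'))
result = (.)

-- Four uses of result reach past the four arguments of addBaseStrict.
addBaseLax :: (Num a, Ord a) => a -> Adder a
addBaseLax = (result . result . result . result) laxPair addBaseStrict

-- Matching on the pair succeeds even with an undefined carry in,
-- although 4 + 5 lands in the EQ case for base 10.
pairVisible :: Bool
pairVisible = case addBaseLax 10 4 (5 :: Int) undefined of (_, _) -> True
```

With `addBaseStrict` in place of `addBaseLax`, evaluating `pairVisible` hits `undefined`, which is the behavior that got the infinite sum stuck.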

Starting over

I’m not happy with any of the addBase definitions above after the first one, which was much too strict. The others get the job done, but heavy-handedly, lacking in grace and elegance.

Here is a definition that is more graceful and is lax enough for our purposes:

addBase ∷ (Num a, Ord a) => a → Adder a
addBase base a b i = (ab', o)
 where
   ab  = a + b
   ab' = ab + (if i then 1 else 0) - (if o then base else 0)
   o   = case ab `compare` (base-1) of
           LT → False
           GT → True
           EQ → i

Written more compactly, and exploiting the short-circuiting nature of (&&) and (||) (each non-strict in its second argument):

   o   = ab >= base-1 && (ab >= base || i)
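
As a quick sanity check, the sketch below (plain ASCII Haskell) compares the `case` and short-circuit formulations of the carry out and confirms that the carry out is available without the carry in whenever `a + b` differs from `base - 1`. The `Adder` synonym is my reconstruction of the post's type, and `carryCase`, `carryShort`, `carriesAgree`, and `knownCarries` are names invented for this example.

```haskell
-- Assumed shape of Adder: two digits and a carry in, giving a digit and a carry out.
type Adder a = a -> a -> Bool -> (a, Bool)

-- Carry out, written with case.
carryCase :: (Num a, Ord a) => a -> a -> Bool -> Bool
carryCase base ab i =
  case ab `compare` (base - 1) of
    LT -> False
    GT -> True
    EQ -> i

-- Carry out, written with short-circuiting (&&) and (||).
carryShort :: (Num a, Ord a) => a -> a -> Bool -> Bool
carryShort base ab i = ab >= base - 1 && (ab >= base || i)

addBase :: (Num a, Ord a) => a -> Adder a
addBase base a b i = (ab', o)
  where
    ab  = a + b
    ab' = ab + (if i then 1 else 0) - (if o then base else 0)
    o   = carryShort base ab i

-- The two formulations agree on every base-10 digit sum and carry in.
carriesAgree :: Bool
carriesAgree =
  and [ carryCase 10 ab i == carryShort 10 ab i
      | ab <- [0 .. 18 :: Int], i <- [False, True] ]

-- The carry in is never inspected when a + b /= base - 1.
knownCarries :: (Bool, Bool)
knownCarries =
  ( snd (addBase 10 3 (4 :: Int) undefined)    -- 7 < 9, so no carry
  , snd (addBase 10 7 (6 :: Int) undefined) )  -- 13 > 9, so carry
```

Here `knownCarries` evaluates to `(False, True)`. Only a digit sum of exactly 9 forces the carry in.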

Infinite digit streams

Some of the previous cases were tricky due to testing for empty lists. To eliminate these details, we can switch from (possibly finite) lists to (necessarily infinite) streams. I’ll use Wouter Swierstra’s Stream library.

With infinite digit streams, we have no use for a carry in, so I’ll switch from a full adder to a half adder.

type HalfAdder a = a → a → (a,Bool)

The Stream library doesn’t define a zipWith3, but the general liftA3 function on applicative functors fills the same role. Eliminating the carry-in allows the definition to become more tightly circular:

addsPS ∷ Adder a → HalfAdder (Stream a)
addsPS add as bs = (cs, S.head is)
 where
   (cs,is) = S.unzip (liftA3 add as bs (S.tail is))

I’m using qualified names for head, tail, and unzip on streams to avoid clashes with the versions on lists.

Giving this stream adder a spin, we get the same infinite sum as with lists above:

*AddingMachines> swap $ addsPS add10 (S.fromList $ cycle [3,5,7,8])
                                     (S.fromList $ cycle [7,2,6])
(True,0<:>8<:>4<:>5<:>6<:>2<:>5<:>1<:>0<:>3<:>0<:>5<:>0<:>8<:>4<:>5 ‥.)
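
For readers without the Stream library at hand, the circular definition can be reproduced with a tiny stand-in stream type. This is only a sketch in plain ASCII Haskell: `Stream`, `headS`, `tailS`, `fromListS`, `takeS`, `mapS`, `zipWith3S`, and `unzipS` are hypothetical replacements for the library functions, and `zipWith3S` plays the role of `liftA3`. The one essential detail is that the zip is lazy in the carry stream, since that stream is defined in terms of itself.

```haskell
data Stream a = Cons a (Stream a)

headS :: Stream a -> a
headS (Cons a _) = a

tailS :: Stream a -> Stream a
tailS (Cons _ s) = s

-- Assumes an infinite input list, such as one built with cycle.
fromListS :: [a] -> Stream a
fromListS = foldr Cons (error "fromListS: finite list")

takeS :: Int -> Stream a -> [a]
takeS n _ | n <= 0 = []
takeS n (Cons a s) = a : takeS (n - 1) s

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Cons a s) = Cons (f a) (mapS f s)

-- Irrefutable in the third stream, which is the circularly defined one.
zipWith3S :: (a -> b -> c -> d) -> Stream a -> Stream b -> Stream c -> Stream d
zipWith3S f (Cons a as') (Cons b bs') ~(Cons c cs') =
  Cons (f a b c) (zipWith3S f as' bs' cs')

unzipS :: Stream (a, b) -> (Stream a, Stream b)
unzipS s = (mapS fst s, mapS snd s)

type Adder a     = a -> a -> Bool -> (a, Bool)
type HalfAdder a = a -> a -> (a, Bool)

-- The lax single-digit adder from the "Starting over" section.
addBase :: (Num a, Ord a) => a -> Adder a
addBase base a b i = (ab', o)
  where
    ab  = a + b
    ab' = ab + (if i then 1 else 0) - (if o then base else 0)
    o   = ab >= base - 1 && (ab >= base || i)

-- Each digit's carry in is the carry out of the digit to its right.
addsPS :: Adder a -> HalfAdder (Stream a)
addsPS add xs ys = (cs, headS is)
  where
    (cs, is) = unzipS (zipWith3S add xs ys (tailS is))

demo :: ([Int], Bool)
demo = (takeS 12 cs, c)
  where
    (cs, c) = addsPS (addBase 10) (fromListS (cycle [3, 5, 7, 8]))
                                  (fromListS (cycle [7, 2, 6]))
-- demo == ([0,8,4,5,6,2,5,1,0,3,0,5], True)
```

Dropping the `~` in `zipWith3S` makes the definition of `is` demand itself before producing its first cell, so the computation no longer terminates.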

Reverse state monad

Can we combine reverse-carrying with a state monad formulation? Yes, using the “backwards state” monad mentioned in The essence of functional programming, Section 2.8.

In ReverseStateAdd.hs, define

type Carrier = StateR Bool

Given the reverse state monad, the single-digit addition is as easy to define as with (forward) State:

addBase ∷ (Ord a, Num a) => a → Adder a
addBase = (result.result.result) StateR AM.addBase

For convenience, swap the carry & digits while testing.

runStateR' ∷ StateR s a → s → (s,a)
runStateR' = (result.result) swap runStateR

Try with finitely many digits:

*ReverseStateAdd> runStateR' (adds add10 [3,5,7,8] [1,6,4,1]) False
(False,[5,2,1,9])
*ReverseStateAdd> runStateR' (adds add10 [8,7,5,3] [1,4,6,1]) False
(True,[0,2,1,4])

And infinitely many digits:

*ReverseStateAdd> runStateR' (adds add10 (cycle [3,5,7,8]) (cycle [7,2,6])) ⊥
(True,[0,8,4,5,6,2,5,1,0,3,0,5,0,8,4,5,6,2,5,1,0,3,0,5,0,8,4,5,6,2,5,1 ‥.])

Implementing the reverse state monad

The reverse (backward) state is defined just as State (forward state):

newtype StateR s a = StateR { runStateR ∷ s → (a,s) }

runStateR' ∷ StateR s a → s → (s,a)
runStateR' = (result.result) swap runStateR

The Functor instance is also defined as with State:

instance Functor (StateR s) where
  fmap f (StateR h) = StateR (λ s → let (a,s') = h s in (f a, s'))

Using the ideas from Prettier functions for wrapping and unwrapping and the notational improvement from Matt Hellige’s Pointless fun, we can get a much more elegant definition:

instance Functor (StateR s) where
  fmap = inStateR . result . first

where

inStateR ∷ ((s → (a,s)) → (t → (b,t)))
         → (StateR s a  → StateR t b)
inStateR = runStateR ~> StateR

The Monad instance shows how the flow of state is reversed. The incoming state flows into the second action, which produces a state for the first action.

instance Monad (StateR s) where
  return a = StateR $ λ s → (a,s)
  m >>= k  = StateR $ λ u → let (a,s) = runStateR m t
                                 (b,t) = runStateR (k a) u
                             in
                               (b,s)
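
A small experiment makes the reversed flow visible. The sketch below is plain ASCII Haskell and self-contained. To keep it short, the Functor and Applicative instances are derived from the monad operations via `liftM` and `ap` rather than written out, and `getR`, `putR`, and `readsTheFuture` are hypothetical helpers named by analogy with `get` and `put` for forward State.

```haskell
import Control.Monad (ap, liftM)

newtype StateR s a = StateR { runStateR :: s -> (a, s) }

instance Functor (StateR s) where
  fmap = liftM

instance Applicative (StateR s) where
  pure a = StateR $ \s -> (a, s)
  (<*>)  = ap

instance Monad (StateR s) where
  m >>= k = StateR $ \u ->
    let (a, s) = runStateR m t       -- first action receives the state from the second
        (b, t) = runStateR (k a) u   -- second action receives the incoming state
    in  (b, s)

-- Read the current state.
getR :: StateR s s
getR = StateR $ \s -> (s, s)

-- Overwrite the state.
putR :: s -> StateR s ()
putR s = StateR $ \_ -> ((), s)

-- getR comes first in the text, yet it observes the state
-- written by the putR that follows it.
readsTheFuture :: StateR Int Int
readsTheFuture = do
  x <- getR
  putR 5
  return x
```

Here `runStateR readsTheFuture 0` yields `(5,5)`: the initial state 0 enters at the end of the block, is overwritten by `putR 5`, and only then reaches `getR`. The mutually recursive `let` in `(>>=)` relies on lazy pattern bindings to make this work.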

Although zipWithM is defined for monads, it can be defined more generally, for arbitrary applicative functors, so we really only required that StateR be an applicative functor. I wonder whether there are example uses of the backward state monad that need the full expressive power of the Monad interface, as opposed to the simpler & more general Applicative interface. The Applicative instance of StateR is more straightforward, as there is no longer information flowing in opposite directions:

instance Applicative (StateR s) where
  pure a = StateR $ λ s → (a,s)
  StateR hf <*> StateR hx = StateR $ λ u →
    let (x,t) = hx u
        (f,s) = hf t
    in
      (f x,s)
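
To see that the Applicative interface alone suffices, here is a self-contained sketch in plain ASCII Haskell with no Monad instance at all. It relies on `zipWithM` from `Control.Monad`, which in current versions of base needs only an Applicative constraint. The definition `adds = zipWithM` and the inlined lax `addBase` are my assumptions about the shape of ReverseStateAdd, not a copy of the module.

```haskell
import Control.Monad (zipWithM)

newtype StateR s a = StateR { runStateR :: s -> (a, s) }

instance Functor (StateR s) where
  fmap f (StateR h) = StateR (\s -> let (a, s') = h s in (f a, s'))

instance Applicative (StateR s) where
  pure a = StateR $ \s -> (a, s)
  StateR hf <*> StateR hx = StateR $ \u ->
    let (x, t) = hx u   -- the argument action runs first, on the incoming state
        (f, s) = hf t   -- the function action sees the state it produced
    in  (f x, s)

type Carrier = StateR Bool
type Adder a = a -> a -> Carrier a   -- assumed shape: the carry is the state

-- Lax single-digit adder, wrapped so that the carry is threaded as state.
addBase :: (Ord a, Num a) => a -> Adder a
addBase base a b = StateR $ \i ->
  let ab = a + b
      o  = ab >= base - 1 && (ab >= base || i)
  in  (ab + (if i then 1 else 0) - (if o then base else 0), o)

-- Add digit lists, most significant digit first.
adds :: Adder a -> [a] -> [a] -> Carrier [a]
adds = zipWithM

finite :: ([Int], Bool)
finite = runStateR (adds (addBase 10) [3, 5, 7, 8] [1, 6, 4, 1]) False
-- ([5,2,1,9], False)

-- With infinitely many digits the incoming carry is never consulted.
infinite :: ([Int], Bool)
infinite = (take 12 ds, c)
  where
    (ds, c) = runStateR (adds (addBase 10) (cycle [3, 5, 7, 8]) (cycle [7, 2, 6])) undefined
-- ([0,8,4,5,6,2,5,1,0,3,0,5], True)
```

The lazy `let` bindings in `(<*>)` are what keep the infinite case productive: each digit's carry out is demanded only when the digit to its left actually needs it.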

Closing thoughts

Adding digits in left-to-right (most to least significant) order saves effort when decisions can be based on approximations rather than exact values. To see how very common this situation is, consider the popularity of finite-precision floating point representations like Float and Double. Whenever we use these representations, we’re computing with only most significant digits. Although efficient, thanks to hardware support, these common types lack the sort of modularity that characterizes pure, lazy functional programming. Commitment to a particular finite precision up front computes too little information, leading to incorrect results, or too much information, leading to wasted time and power.

The requirement of choosing precision up front breaks modularity, because the best choice depends on how intermediate computed values are consumed. Modularity insists that values be specified independently from their uses. Exactly this issue motivates laziness, as explained and illustrated in Why Functional Programming Matters. Requiring ourselves to program with types like Float and Double is thus like having to choose between fixed-length list types, with elements getting lost off the end when the preselected length is exceeded.

Of course the representations in this post are nowhere near suited to replace hardware-supported, inexact numerics. Still, they’re fun to play with, and they illustrate some functional programming techniques, while restoring compositionality/modularity and simple, precise semantics.