Archive for February 2011

## Deriving list scans

I’ve been playing with deriving efficient parallel, imperative implementations of "prefix sum" or more generally "left scan". Following posts will explore the parallel & imperative derivations, but as a warm-up, I’ll tackle the functional & sequential case here.

### Folds

You’re probably familiar with the higher-order functions for left and right "fold". The current documentation says:

`foldl`, applied to a binary operator, a starting value (typically the left-identity of the operator), and a list, reduces the list using the binary operator, from left to right:

``foldl f z [x1, x2, ⋯, xn] ≡ (⋯((z `f` x1) `f` x2) `f`⋯) `f` xn``

The list must be finite.

`foldr`, applied to a binary operator, a starting value (typically the right-identity of the operator), and a list, reduces the list using the binary operator, from right to left:

``foldr f z [x1, x2, ⋯, xn] ≡ x1 `f` (x2 `f` ⋯ (xn `f` z)⋯)``

And here are typical definitions:

``foldl ∷ (b → a → b) → b → [a] → bfoldl f z []     = zfoldl f z (x:xs) = foldl f (z `f` x) xsfoldr ∷ (a → b → b) → b → [a] → bfoldr f z []     = zfoldr f z (x:xs) = x `f` foldr f z xs``

Notice that `foldl` builds up its result one step at a time and reveals it all at once, in the end. The whole result value is locked up until the entire input list has been traversed. In contrast, `foldr` starts revealing information right away, and so works well with infinite lists. Like `foldl`, `foldr` also yields only a final value.

Sometimes it’s handy to also get to all of the intermediate steps. Doing so takes us beyond the land of folds to the kingdom of scans.

### Scans

The `scanl` and `scanr` functions correspond to `foldl` and `foldr` but produce all intermediate accumulations, not just the final one.

``scanl ∷ (b → a → b) → b → [a] → [b]scanl f z [x1, x2,  ⋯ ] ≡ [z, z `f` x1, (z `f` x1) `f` x2, ⋯]scanr ∷ (a → b → b) → b → [a] → [b]scanr f z [⋯, xn_1, xn] ≡ [⋯, xn_1 `f` (xn `f` z), xn `f` z, z]``

As you might expect, the last value is the complete left fold, and the first value in the scan is the complete right fold:

``last (scanl f z xs) ≡ foldl f z xshead (scanr f z xs) ≡ foldr f z xs``

which is to say

``last ∘ scanl f z ≡ foldl f zhead ∘ scanr f z ≡ foldr f z``

The standard scan definitions are trickier than the fold definitions:

``scanl ∷ (b → a → b) → b → [a] → [b]scanl f z ls = z : (case ls of                     []   → []                     x:xs → scanl f (z `f` x) xs)scanr ∷ (a → b → b) → b → [a] → [b]scanr _ z []     = [z]scanr f z (x:xs) = (x `f` q) : qs                   where qs@(q:_) = scanr f z xs``

Every time I encounter these definitions, I have to walk through it again to see what’s going on. I finally sat down to figure out how these tricky definitions might emerge from simpler specifications. In other words, how to derive these definitions systematically from simpler but less efficient definitions.

Most likely, these derivations have been done before, but I learned something from the effort, and I hope you do, too.

Continue reading ‘Deriving list scans’ »

## From tries to trees

This post is the last of a series of six relating numbers, vectors, and trees, revolving around the themes of static size-typing and memo tries. We’ve seen that length-typed vectors form a trie for bounded numbers, and can handily represent numbers as well. We’ve also seen that n-dimensional vectors themselves have an elegant trie, which is the n-ary composition of the element type’s trie functor:

``type VTrie n a = Trie a :^ n ``

where for any functor `f` and natural number type `n`,

``f :^ n ≅ f ∘ ⋯ ∘ f  -- (n times)``

This final post in the series places this elegant mechanism of n-ary functor composition into a familiar & useful context, namely trees. Again, type-encoded Peano numbers are central. Just as `BNat` uses these number types to (statically) bound natural numbers (e.g., for a vector index or a numerical digit), and `Vec` uses number types to capture vector length, we’ll next use number types to capture tree depth.

Edits:

• 2011-02-02: Changes thanks to comments from Sebastian Fischer
• Added pointer to Memoizing polymorphic functions via unmemoization for derivation of `Tree d a ≅ [d] → a`.
• Fixed signatures for some `Branch` variants, bringing type parameter `a` into parens.
• Clarification about number of `VecTree` vs pairing constructors in remarks on left- vs right-folded trees.