## Deriving list scans

I’ve been playing with deriving efficient parallel, imperative implementations of "prefix sum" or more generally "left scan". Following posts will explore the parallel & imperative derivations, but as a warm-up, I’ll tackle the functional & sequential case here.

### Folds

You’re probably familiar with the higher-order functions for left and right "fold". The current documentation says:

`foldl`

, applied to a binary operator, a starting value (typically the left-identity of the operator), and a list, reduces the list using the binary operator, from left to right:`foldl f z [x1, x2, ⋯, xn] ≡ (⋯((z `f` x1) `f` x2) `f`⋯) `f` xn`

The list must be finite.

`foldr`

, applied to a binary operator, a starting value (typically the right-identity of the operator), and a list, reduces the list using the binary operator, from right to left:`foldr f z [x1, x2, ⋯, xn] ≡ x1 `f` (x2 `f` ⋯ (xn `f` z)⋯)`

And here are typical definitions:

`foldl ∷ (b → a → b) → b → [a] → b`

foldl f z [] = z

foldl f z (x:xs) = foldl f (z `f` x) xs

foldr ∷ (a → b → b) → b → [a] → b

foldr f z [] = z

foldr f z (x:xs) = x `f` foldr f z xs

Notice that `foldl`

builds up its result one step at a time and reveals it all at once, in the end. The whole result value is locked up until the entire input list has been traversed. In contrast, `foldr`

starts revealing information right away, and so works well with infinite lists. Like `foldl`

, `foldr`

also yields only a final value.

Sometimes it’s handy to also get to all of the intermediate steps. Doing so takes us beyond the land of folds to the kingdom of scans.

### Scans

The `scanl`

and `scanr`

functions correspond to `foldl`

and `foldr`

but produce *all* intermediate accumulations, not just the final one.

`scanl ∷ (b → a → b) → b → [a] → [b]`

scanl f z [x1, x2, ⋯ ] ≡ [z, z `f` x1, (z `f` x1) `f` x2, ⋯]

scanr ∷ (a → b → b) → b → [a] → [b]

scanr f z [⋯, xn_1, xn] ≡ [⋯, xn_1 `f` (xn `f` z), xn `f` z, z]

As you might expect, the last value is the complete left fold, and the *first* value in the scan is the complete *right* fold:

`last (scanl f z xs) ≡ foldl f z xs`

head (scanr f z xs) ≡ foldr f z xs

which is to say

`last ∘ scanl f z ≡ foldl f z`

head ∘ scanr f z ≡ foldr f z

The standard scan definitions are trickier than the fold definitions:

`scanl ∷ (b → a → b) → b → [a] → [b]`

scanl f z ls = z : (case ls of

[] → []

x:xs → scanl f (z `f` x) xs)

scanr ∷ (a → b → b) → b → [a] → [b]

scanr _ z [] = [z]

scanr f z (x:xs) = (x `f` q) : qs

where qs@(q:_) = scanr f z xs

Every time I encounter these definitions, I have to walk through it again to see what’s going on. I finally sat down to figure out how these tricky definitions might *emerge* from simpler specifications. In other words, how to *derive* these definitions systematically from simpler but less efficient definitions.

Most likely, these derivations have been done before, but I learned something from the effort, and I hope you do, too.

Continue reading ‘Deriving list scans’ »