Archive for 22nd February 2011

Deriving list scans

I’ve been playing with deriving efficient parallel, imperative implementations of "prefix sum" or more generally "left scan". Following posts will explore the parallel & imperative derivations, but as a warm-up, I’ll tackle the functional & sequential case here.

Folds

You’re probably familiar with the higher-order functions for left and right "fold". The current documentation says:

foldl, applied to a binary operator, a starting value (typically the left-identity of the operator), and a list, reduces the list using the binary operator, from left to right:

foldl f z [x1, x2, ⋯, xn] ≡ (⋯((z `f` x1) `f` x2) `f`⋯) `f` xn

The list must be finite.

foldr, applied to a binary operator, a starting value (typically the right-identity of the operator), and a list, reduces the list using the binary operator, from right to left:

foldr f z [x1, x2, ⋯, xn] ≡ x1 `f` (x2 `f` ⋯ (xn `f` z)⋯)

And here are typical definitions:

foldl  (b → a → b) → b → [a] → b
foldl f z [] = z
foldl f z (x:xs) = foldl f (z `f` x) xs

foldr (a → b → b) → b → [a] → b
foldr f z [] = z
foldr f z (x:xs) = x `f` foldr f z xs

Notice that foldl builds up its result one step at a time and reveals it all at once, in the end. The whole result value is locked up until the entire input list has been traversed. In contrast, foldr starts revealing information right away, and so works well with infinite lists. Like foldl, foldr also yields only a final value.

Sometimes it’s handy to also get to all of the intermediate steps. Doing so takes us beyond the land of folds to the kingdom of scans.

Scans

The scanl and scanr functions correspond to foldl and foldr but produce all intermediate accumulations, not just the final one.

scanl  (b → a → b) → b → [a] → [b]

scanl f z [x1, x2, ⋯ ] ≡ [z, z `f` x1, (z `f` x1) `f` x2, ⋯]

scanr (a → b → b) → b → [a] → [b]

scanr f z [⋯, xn_1, xn] ≡ [⋯, xn_1 `f` (xn `f` z), xn `f` z, z]

As you might expect, the last value is the complete left fold, and the first value in the scan is the complete right fold:

last (scanl f z xs) ≡ foldl f z xs
head (scanr f z xs) ≡ foldr f z xs

which is to say

lastscanl f z ≡ foldl f z
headscanr f z ≡ foldr f z

The standard scan definitions are trickier than the fold definitions:

scanl  (b → a → b) → b → [a] → [b]
scanl f z ls = z : (case ls of
[] → []
x:xs → scanl f (z `f` x) xs)

scanr (a → b → b) → b → [a] → [b]
scanr _ z [] = [z]
scanr f z (x:xs) = (x `f` q) : qs
where qs@(q:_) = scanr f z xs

Every time I encounter these definitions, I have to walk through it again to see what’s going on. I finally sat down to figure out how these tricky definitions might emerge from simpler specifications. In other words, how to derive these definitions systematically from simpler but less efficient definitions.

Most likely, these derivations have been done before, but I learned something from the effort, and I hope you do, too.

Continue reading ‘Deriving list scans’ »