## Parallel tree scanning by composition

My last few blog posts have been on the theme of *scans*, and particularly on *parallel* scans. In *Composable parallel scanning*, I tackled parallel scanning in a very general setting. There are five simple building blocks out of which a vast assortment of data structures can be built, namely constant (no value), identity (one value), sum, product, and composition. The post defined parallel prefix and suffix scan for each of these five "functor combinators", in terms of the same scan operation on each of the component functors. Every functor built out of this basic set thus has a parallel scan. Functors defined more conventionally can be given scan implementations simply by converting to a composition of the basic set, scanning, and then back to the original functor. Moreover, I expect this implementation could be generated automatically, similarly to GHC’s `DerivingFunctor`

extension.

Now I’d like to show two examples of parallel scan composition in terms of binary trees, namely the top-down and bottom-up variants of perfect binary leaf trees used in previous posts. (In previous posts, I used the terms "right-folded" and "left-folded" instead of "top-down" and "bottom-up".) The resulting two algorithms are expressed nearly identically, but have differ significantly in the work performed. The top-down version does $\Theta (n\phantom{\rule{0.167em}{0ex}}\mathrm{log}\phantom{\rule{0.167em}{0ex}}n)$ work, while the bottom-up version does only $\Theta (n)$, and thus the latter algorithm is work-efficient, while the former is not. Moreover, with a *very* simple optimization, the bottom-up tree algorithm corresponds closely to Guy Blelloch’s parallel prefix scan for arrays, given in *Programming parallel algorithms*. I’m delighted with this result, as I had been wondering how to think about Guy’s algorithm.

**Edit:**

- 2011-05-31: Added
`Scan`

and`Applicative`

instances for`T2`

and`T4`

.