Archive for March 2011

## Composable parallel scanning

The post Deriving list scans gave a simple specification of the list-scanning functions scanl and scanr, and then transformed those specifications into the standard optimized implementations. Next, the post Deriving parallel tree scans adapted the specifications and derivations to a type of binary trees. The resulting implementations are parallel-friendly, but not work-efficient, in that they perform $n\phantom{\rule{0.167em}{0ex}}\mathrm{log}\phantom{\rule{0.167em}{0ex}}n$ work vs linear work as in the best-known sequential algorithm.

Besides the work-inefficiency, I don’t know how to extend the critical initTs and tailTs functions (analogs of inits and tails on lists) to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The difficulty I encounter is that the functions initTs and tailTs make unbalanced trees out of balanced ones, so I don’t know how to adapt the specifications when types prevent the existence of unbalanced trees.

This new post explores an approach to generalized scanning via type classes. After defining the classes and giving a simple example, I’ll give a simple & general framework based on composing functor combinators.

Edits:

• 2011-03-02: Fixed typo. "constant functor is easiest" (instead of "identity functor"). Thanks, frguybob.
• 2011-03-05: Removed final unfinished sentence.
• 2011-07-28: Replace "assocL" with "assocR" in prefixScan derivation for g ∘ f.

Continue reading ‘Composable parallel scanning’ »

## Deriving parallel tree scans

The post Deriving list scans explored folds and scans on lists and showed how the usual, efficient scan implementations can be derived from simpler specifications.

Let’s see now how to apply the same techniques to scans over trees.

This new post is one of a series leading toward algorithms optimized for execution on massively parallel, consumer hardware, using CUDA or OpenCL.

Edits: