Composable parallel scanning
The post Deriving list scans gave a simple specification of the list-scanning functions scanl and scanr, and then transformed those specifications into the standard optimized implementations. Next, the post Deriving parallel tree scans adapted the specifications and derivations to a type of binary trees. The resulting implementations are parallel-friendly, but not work-efficient, in that they perform n log n work vs linear work as in the best-known sequential algorithm.
Besides the work-inefficiency, I don’t know how to extend the critical initTs and tailTs functions (analogs of inits and tails on lists) to depth-typed, perfectly balanced trees, of the sort I played with in A trie for length-typed vectors and From tries to trees. The difficulty I encounter is that the functions initTs and tailTs make unbalanced trees out of balanced ones, so I don’t know how to adapt the specifications when types prevent the existence of unbalanced trees.
This new post explores an approach to generalized scanning via type classes. After defining the classes and giving a simple example, I’ll give a simple & general framework based on composing functor combinators.
Edits:
- 2011-03-02: Fixed typo. "constant functor is easiest" (instead of "identity functor"). Thanks, frguybob.
- 2011-03-05: Removed final unfinished sentence.
- 2011-07-28: Replace "assocL" with "assocR" in prefixScan derivation for g ∘ f.
Generalizing list scans
The left and right scan functions on lists have an awkward feature. The output list has one more element than the input list, corresponding to the fact that the number of prefixes (inits) of a list is one more than the number of elements, and similarly for suffixes (tails).
While it’s easy to extend a list by adding one more element, it’s not easy with other functors. In Deriving parallel tree scans, I simply removed the ∅ element from the scan. In this post, I’ll instead change the interface to produce an output of exactly the same shape, plus one extra element. The extra element will equal a fold over the complete input. If you recall, we had to search for that complete fold in an input subtree in order to adjust the other subtree. (See headT and lastT and their generalizations in Deriving parallel tree scans.) Separating out this value eliminates the search.
Define a class with methods for prefix and suffix scan:
class Scan f where
prefixScan, suffixScan ∷ Monoid m ⇒ f m → (m, f m)
Prefix scans (prefixScan) accumulate moving left-to-right, while suffix scans (suffixScan) accumulate moving right-to-left.
A simple example: pairs
To get a first sense of generalized scans, let’s see how to scan over a pair functor.
data Pair a = a :# a deriving (Eq,Ord,Show)
With GHC’s DeriveFunctor option, we could also derive a Functor instance, but for clarity, define it explicitly:
instance Functor Pair where
fmap f (a :# b) = (f a :# f b)
The scans:
instance Scan Pair where
prefixScan (a :# b) = (a ⊕ b, (∅ :# a))
suffixScan (a :# b) = (a ⊕ b, (b :# ∅))
As you can see, if we eliminated the ∅ elements, we could shift to the left or right and forgo the extra result.
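For concreteness, here is a small worked example with the Sum monoid from Data.Monoid (recalling that ⊕ and ∅ stand for mappend and mempty). In each case the first component is the full fold, and the remaining elements are the folds of the proper prefixes or suffixes:

prefixScan (Sum 3 :# Sum 4)  ≡  (Sum 7, Sum 0 :# Sum 3)
suffixScan (Sum 3 :# Sum 4)  ≡  (Sum 7, Sum 4 :# Sum 0)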
Naturally, there is also a Foldable instance, and the scans produce the fold results as well as sub-folds:
instance Foldable Pair where
fold (a :# b) = a ⊕ b
The Pair functor also has unsurprising instances for Applicative and Traversable.
instance Applicative Pair where
pure a = a :# a
(f :# g) <*> (x :# y) = (f x :# g y)
instance Traversable Pair where
sequenceA (fa :# fb) = (:#) <$> fa <*> fb
We don’t really have to figure out how to define scans for every functor separately. We can instead look at how functors are composed out of their essential building blocks.
Scans for functor combinators
To see how to scan over a broad range of functors, let’s look at each of the functor combinators, e.g., as in Elegant memoization with higher-order types.
Constant
The constant functor is easiest.
newtype Const x a = Const x
There are no values to accumulate, so the final result (fold) is ∅.
instance Scan (Const x) where
prefixScan (Const x) = (∅, Const x)
suffixScan = prefixScan
Identity
The identity functor is nearly as easy.
newtype Id a = Id a
instance Scan Id where
prefixScan (Id m) = (m, Id ∅)
suffixScan = prefixScan
Sum
Scanning in a sum is just scanning in a summand:
data (f + g) a = InL (f a) | InR (g a)
instance (Scan f, Scan g) ⇒ Scan (f + g) where
prefixScan (InL fa) = second InL (prefixScan fa)
prefixScan (InR ga) = second InR (prefixScan ga)
suffixScan (InL fa) = second InL (suffixScan fa)
suffixScan (InR ga) = second InR (suffixScan ga)
These definitions correspond to simple "commutative diagram" properties, e.g.,
prefixScan ∘ InL ≡ second InL ∘ prefixScan
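For example, with the Pair and Id functors above as the two summands (a small sketch, again using the Sum monoid):

prefixScan (InL (Sum 3 :# Sum 4) ∷ (Pair + Id) (Sum Int))  ≡  (Sum 7, InL (Sum 0 :# Sum 3))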
Product
Product scanning is a little trickier.
data (f × g) a = f a × g a
Scan each of the two parts separately, and then combine the final (fold) part of one result with each of the non-final elements of the other.
instance (Scan f, Scan g, Functor f, Functor g) ⇒ Scan (f × g) where
prefixScan (fa × ga) = (af ⊕ ag, fa' × ((af ⊕) <$> ga'))
where (af,fa') = prefixScan fa
(ag,ga') = prefixScan ga
suffixScan (fa × ga) = (af ⊕ ag, ((⊕ ag) <$> fa') × ga')
where (af,fa') = suffixScan fa
(ag,ga') = suffixScan ga
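For instance, here is a small sketch with the Sum monoid, taking the product of the Pair and Id functors defined earlier. The left component’s sub-folds pass through unchanged, while each element of the right component gets the left component’s total (here Sum 3) combined in on the left:

prefixScan ((Sum 1 :# Sum 2) × Id (Sum 5))  ≡  (Sum 8, (Sum 0 :# Sum 1) × Id (Sum 3))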
Composition
Finally, composition is the trickiest.
newtype (g ∘ f) a = O (g (f a))
The target signatures:
prefixScan, suffixScan ∷ Monoid m ⇒ (g ∘ f) m → (m, (g ∘ f) m)
To find the prefix and suffix scan definitions, fiddle with types beginning at the domain type for prefixScan or suffixScan and arriving at the range type.
Some helpers:
zip ∷ Applicative g ⇒ (g a, g b) → g (a,b)
zip = uncurry (liftA2 (,))
unzip ∷ Functor g ⇒ g (a,b) → (g a, g b)
unzip = fmap fst &&& fmap snd
assocR ∷ ((a,b),c) → (a,(b,c))
assocR ((a,b),c) = (a,(b,c))
adjustL ∷ (Functor f, Monoid m) ⇒ (m, f m) → f m
adjustL (m, ms) = (m ⊕) <$> ms
adjustR ∷ (Functor f, Monoid m) ⇒ (m, f m) → f m
adjustR (m, ms) = (⊕ m) <$> ms
First prefixScan:
gofm ∷ (g ∘ f) m
unO '' ∷ g (f m)
fmap prefixScan '' ∷ g (m, f m)
unzip '' ∷ (g m, g (f m))
first prefixScan '' ∷ ((m, g m), g (f m))
assocR '' ∷ (m, (g m, g (f m)))
second zip '' ∷ (m, g (m, f m))
second (fmap adjustL) '' ∷ (m, g (f m))
second O '' ∷ (m, (g ∘ f) m)
Then suffixScan:
gofm ∷ (g ∘ f) m
unO '' ∷ g (f m)
fmap suffixScan '' ∷ g (m, f m)
unzip '' ∷ (g m, g (f m))
first suffixScan '' ∷ ((m, g m), g (f m))
assocR '' ∷ (m, (g m, g (f m)))
second zip '' ∷ (m, g (m, f m))
second (fmap adjustR) '' ∷ (m, g (f m))
second O '' ∷ (m, (g ∘ f) m)
Putting together the pieces and simplifying just a bit leads to the method definitions:
instance (Scan g, Scan f, Functor f, Applicative g) ⇒ Scan (g ∘ f) where
prefixScan = second (O ∘ fmap adjustL ∘ zip)
∘ assocR
∘ first prefixScan
∘ unzip
∘ fmap prefixScan
∘ unO
suffixScan = second (O ∘ fmap adjustR ∘ zip)
∘ assocR
∘ first suffixScan
∘ unzip
∘ fmap suffixScan
∘ unO
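As a quick sanity check (again a sketch with the Sum monoid), composing Pair with itself gives scans over four elements, and the result agrees with the familiar prefix scan of 1, 2, 3, 4:

prefixScan (O ((Sum 1 :# Sum 2) :# (Sum 3 :# Sum 4)))
  ≡  (Sum 10, O ((Sum 0 :# Sum 1) :# (Sum 3 :# Sum 6)))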
What’s coming up?
- What might not be easy to spot at this point is that the prefixScan and suffixScan methods given in this post do essentially the same job as in Deriving parallel tree scans, when the binary tree type is deconstructed into functor combinators. A future post will show this connection.
- Switch from standard (right-folded) trees to left-folded trees (in the sense of A trie for length-typed vectors and From tries to trees), which reduces the running time from n log n to linear.
- Scanning in place, i.e., destructively replacing the values in the input structure rather than allocating a new structure.
Comments:

Jade NB:
Your post seems to end mid-sentence (unless it is just very dry humour!).
2 March 2011, 2:55 pm

conal:
Thanks, Jade. I’ve eliminated that final sentence fragment.
5 March 2011, 11:57 am

Conal Elliott » Blog Archive » Parallel tree scanning by composition:
[…] About « Composable parallel scanning […]
24 May 2011, 12:31 pm

Twan van Laarhoven:
You could implement prefixScan and suffixScan using mapAccumL and mapAccumR from Data.Traversable. Since GHC can also derive Traversable, this way you need no extra manually written code for each data type.
26 May 2011, 11:58 pm

conal:
Twan: Yes. I was thinking of using mapAccumL and mapAccumR to define prefixScan and suffixScan as specifications. I wouldn’t want the resulting implementations, since they’re sequential, not parallel.
27 May 2011, 4:39 pm
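For reference, here is a minimal sketch of Twan’s suggestion, assuming mapAccumL and mapAccumR from Data.Traversable and using ∅ and ⊕ for mempty and mappend as in the post; the names prefixScanT and suffixScanT are placeholders, and as noted above these make sequential specifications rather than parallel implementations:

import Data.Traversable (Traversable, mapAccumL, mapAccumR)

prefixScanT, suffixScanT ∷ (Traversable f, Monoid m) ⇒ f m → (m, f m)
prefixScanT = mapAccumL (\acc m → (acc ⊕ m, acc)) ∅  -- accumulate left-to-right
suffixScanT = mapAccumR (\acc m → (m ⊕ acc, acc)) ∅  -- accumulate right-to-left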