Parallel tree scanning by composition
My last few blog posts have been on the theme of scans, and particularly on parallel scans. In Composable parallel scanning, I tackled parallel scanning in a very general setting. There are five simple building blocks out of which a vast assortment of data structures can be built, namely constant (no value), identity (one value), sum, product, and composition. The post defined parallel prefix and suffix scan for each of these five "functor combinators", in terms of the same scan operation on each of the component functors. Every functor built out of this basic set thus has a parallel scan. Functors defined more conventionally can be given scan implementations simply by converting to a composition of the basic set, scanning, and then back to the original functor. Moreover, I expect this implementation could be generated automatically, similarly to GHC’s DerivingFunctor
extension.
Now I’d like to show two examples of parallel scan composition in terms of binary trees, namely the top-down and bottom-up variants of perfect binary leaf trees used in previous posts. (In previous posts, I used the terms "right-folded" and "left-folded" instead of "top-down" and "bottom-up".) The resulting two algorithms are expressed nearly identically, but have differ significantly in the work performed. The top-down version does work, while the bottom-up version does only , and thus the latter algorithm is work-efficient, while the former is not. Moreover, with a very simple optimization, the bottom-up tree algorithm corresponds closely to Guy Blelloch’s parallel prefix scan for arrays, given in Programming parallel algorithms. I’m delighted with this result, as I had been wondering how to think about Guy’s algorithm.
Edit:
- 2011-05-31: Added
Scan
andApplicative
instances forT2
andT4
.
Scanning via functor combinators
In Composable parallel scanning, we saw the Scan
class:
class Scan f where
prefixScan, suffixScan ∷ Monoid m ⇒ f m → (m, f m)
Given a structure of values, the prefix and suffix scan methods generate the overall fold
(of type m
), plus a structure of the same type as the input. (In contrast, the usual Haskell scanl
and scanr
functions on lists yield a single list with one more element than the source list. I changed the interface for generality and composability.) The post gave instances for the basic set of five functor combinators.
Most functors are not defined via the basic combinators, but as mentioned above, we can scan by conversion to and from the basic set. For convenience, encapsulate this conversion in a type class:
class EncodeF f where
type Enc f ∷ * → *
encode ∷ f a → Enc f a
decode ∷ Enc f a → f a
and define scan functions via EncodeF
:
prefixScanEnc, suffixScanEnc ∷
(EncodeF f, Scan (Enc f), Monoid m) ⇒ f m → (m, f m)
prefixScanEnc = second decode ∘ prefixScan ∘ encode
suffixScanEnc = second decode ∘ suffixScan ∘ encode
Lists
As a first example, consider
instance EncodeF [] where
type Enc [] = Const () + Id × []
encode [] = InL (Const ())
encode (a : as) = InR (Id a × as)
decode (InL (Const ())) = []
decode (InR (Id a × as)) = a : as
And declare a boilerplate Scan
instance via EncodeF
:
instance Scan [] where
prefixScan = prefixScanEnc
suffixScan = suffixScanEnc
I haven’t checked the details, but I think with this instance, suffix scanning has okay performance, while prefix scan does quadratic work. The reason is the in the Scan
instance for products, the two components are scanned independently (in parallel), and then the whole second component is adjusted for prefixScan
, while the whole first component is adjusted for suffixScan
. In the case of lists, the first component is the list head, and second component is the list tail.
For your reading convenience, here’s that Scan
instance again:
instance (Scan f, Scan g, Functor f, Functor g) ⇒ Scan (f × g) where
prefixScan (fa × ga) = (af ⊕ ag, fa' × ((af ⊕) <$> ga'))
where (af,fa') = prefixScan fa
(ag,ga') = prefixScan ga
suffixScan (fa × ga) = (af ⊕ ag, ((⊕ ag) <$> fa') × ga')
where (af,fa') = suffixScan fa
(ag,ga') = suffixScan ga
The lop-sidedness of the list type thus interferes with parallelization, and makes the parallel scans perform much worse than cumulative sequential scans.
Let’s next look at a more balanced type.
Binary Trees
We’ll get better parallel performance by organizing our data so that we can cheaply partition it into roughly equal pieces. Tree types allows such partitioning.
Top-down trees
We’ll try a few variations, starting with a simple binary tree.
data T1 a = L1 a | B1 (T1 a) (T1 a) deriving Functor
Encoding and decoding is straightforward:
instance EncodeF T1 where
type Enc T1 = Id + T1 × T1
encode (L1 a) = InL (Id a)
encode (B1 s t) = InR (s × t)
decode (InL (Id a)) = L1 a
decode (InR (s × t)) = B1 s t
instance Scan T1 where
prefixScan = prefixScanEnc
suffixScan = suffixScanEnc
Note that these definitions could be generated automatically from the data type definition.
For balanced trees, prefix and suffix scan divide the problem in half at each step, solve each half, and do linear work to patch up one of the two halves. Letting be the number of elements, and the work, we have the recurrence for some constant factor . By the Master theorem, therefore, the work done is . (Use case 2, with , , and .)
Again assuming a balanced tree, the computation dependencies have logarithmic depth, so the ideal parallel running time (assuming sufficient processors) is . Thus we have an algorithm that is depth-efficient (modulo constant factors) but work-inefficient.
Composition
A binary tree as defined above is either a leaf or a pair of binary trees. We can make this pair-ness more explicit with a reformulation:
data T2 a = L2 a | B2 (Pair (T2 a)) deriving Functor
where Pair
, as in Composable parallel scanning, is defined as
data Pair a = a :# a deriving Functor
or even
type Pair = Id × Id
For encoding and decoding, we could use the same representation as with T1
, but let’s instead use a more natural one for the definition of T2
:
instance EncodeF T2 where
type Enc T2 = Id + Pair ∘ T2
encode (L2 a) = InL (Id a)
encode (B2 st) = InR (O st)
decode (InL (Id a)) = L2 a
decode (InR (O st)) = B2 st
Boilerplate scanning:
instance Scan T2 where
prefixScan = prefixScanEnc
suffixScan = suffixScanEnc
for which we’ll need an applicative instance:
instance Applicative T2 where
pure = L2
L2 f <*> L2 x = L2 (f x)
B2 (fs :# gs) <*> B2 (xs :# ys) = B2 ((fs <*> xs) :# (gs <*> ys))
_ <*> _ = error "T2 (<*>): structure mismatch"
The O
constructor is for functor composition.
With a small change to the tree type, we can make the composition of Pair
and T
more explicit:
data T3 a = L3 a | B3 ((Pair ∘ T3) a) deriving Functor
Then the conversion becomes even simpler, since there’s no need to add or remove O
wrappers:
instance EncodeF T3 where
type Enc T3 = Id + Pair ∘ T3
encode (L3 a) = InL (Id a)
encode (B3 st) = InR st
decode (InL (Id a)) = L3 a
decode (InR st) = B3 st
Bottom-up trees
In the formulations above, a non-leaf tree consists of a pair of trees. I’ll call these trees "top-down", since visible pair structure begins at the top.
With a very small change, we can instead use a tree of pairs:
data T4 a = L4 a | B4 (T4 (Pair a)) deriving Functor
Again an applicative instance allows a standard Scan
instance:
instance Scan T4 where
prefixScan = prefixScanEnc
suffixScan = suffixScanEnc
instance Applicative T4 where
pure = L4
L4 f <*> L4 x = L4 (f x)
B4 fgs <*> B4 xys = B4 (liftA2 h fgs xys)
where h (f :# g) (x :# y) = f x :# g y
_ <*> _ = error "T4 (<*>): structure mismatch"
or a more explicitly composed form:
data T5 a = L5 a | B5 ((T5 ∘ Pair) a) deriving Functor
I’ll call these new variations "bottom-up" trees, since visible pair structure begins at the bottom. After stripping off the branch constructor, B4
, we can get at the pair-valued leaves by means of fmap
, fold
, or traverse
(or variations). For B5
, we’d also have to strip off the O
wrapper (functor composition).
Encoding is nearly the same as with top-down trees. For instance,
instance EncodeF T4 where
type Enc T4 = Id + T4 ∘ Pair
encode (L4 a) = InL (Id a)
encode (B4 t) = InR (O t)
decode (InL (Id a)) = L4 a
decode (InR (O t)) = B4 t
Scanning pairs
We’ll need to scan on the Pair
functor. If we use the definition of Pair
above in terms of Id
and (×)
, then we’ll get scanning for free. For using Pair
, I find the explicit data type definition above more convenient. We can then derive a Scan
instance by conversion. Start with a standard specification:
data Pair a = a :# a deriving Functor
And encode & decode explicitly:
instance EncodeF Pair where
type Enc Pair = Id × Id
encode (a :# b) = Id a × Id b
decode (Id a × Id b) = a :# b
Then use our boilerplate Scan
instance for EncodeF
instances:
instance Scan Pair where
prefixScan = prefixScanEnc
suffixScan = suffixScanEnc
We’ve seen the Scan
instance for (×)
above. The instance for Id
is very simple:
newtype Id a = Id a
instance Scan Id where
prefixScan (Id m) = (m, Id ∅)
suffixScan = prefixScan
Given these definitions, we can calculate a more streamlined Scan
instance for Pair
:
prefixScan (a :# b)
≡ {- specification -}
prefixScanEnc (a :# b)
≡ {- prefixScanEnc definition -}
(second decode ∘ prefixScan ∘ encode) (a :# b)
≡ {- (∘) -}
second decode (prefixScan (encode (a :# b)))
≡ {- encode definition for Pair -}
second decode (prefixScan (Id a × Id b))
≡ {- prefixScan definition for f × g -}
second decode
(af ⊕ ag, fa' × ((af ⊕) <$> ga'))
where (af,fa') = prefixScan (Id a)
(ag,ga') = prefixScan (Id b)
≡ {- Definition of second on functions -}
(af ⊕ ag, decode (fa' × ((af ⊕) <$> ga')))
where (af,fa') = prefixScan (Id a)
(ag,ga') = prefixScan (Id b)
≡ {- prefixScan definition for Id -}
(af ⊕ ag, decode (fa' × ((af ⊕) <$> ga')))
where (af,fa') = (a, Id ∅)
(ag,ga') = (b, Id ∅)
≡ {- substitution -}
(a ⊕ b, decode (Id ∅ × ((a ⊕) <$> Id ∅)))
≡ {- fmap/(<$>) for Id -}
(a ⊕ b, decode (Id ∅ × Id (a ⊕ ∅)))
≡ {- Monoid law -}
(a ⊕ b, decode (Id ∅ × Id a))
≡ {- decode definition on Pair -}
(a ⊕ b, (∅ :# a))
Whew! And similarly for suffixScan
.
Now let’s recall the Scan
instance for Pair
given in Composable parallel scanning:
instance Scan Pair where
prefixScan (a :# b) = (a ⊕ b, (∅ :# a))
suffixScan (a :# b) = (a ⊕ b, (b :# ∅))
Hurray! The derivation led us to the same definition. A "sufficiently smart" compiler could do this derivation automatically.
With this warm-up derivation, let’s now turn to trees.
Scanning trees
Given the tree encodings above, how does scan work? We’ll have to consult Scan
instances for some of the functor combinators. The product instance is repeated above. We’ll also want the instances for sum and composition. Omitting the suffixScan
definitions for brevity:
data (f + g) a = InL (f a) | InR (g a)
instance (Scan f, Scan g) ⇒ Scan (f + g) where
prefixScan (InL fa) = second InL (prefixScan fa)
prefixScan (InR ga) = second InR (prefixScan ga)
newtype (g ∘ f) a = O (g (f a))
instance (Scan g, Scan f, Functor f, Applicative g) ⇒ Scan (g ∘ f) where
prefixScan = second (O ∘ fmap adjustL ∘ zip)
∘ assocR
∘ first prefixScan
∘ unzip
∘ fmap prefixScan
∘ unO
This last definition uses a few utility functions:
zip ∷ Applicative g ⇒ (g a, g b) → g (a,b)
zip = uncurry (liftA2 (,))
unzip ∷ Functor g ⇒ g (a,b) → (g a, g b)
unzip = fmap fst &&& fmap snd
assocR ∷ ((a,b),c) → (a,(b,c))
assocR ((a,b),c) = (a,(b,c))
adjustL ∷ (Functor f, Monoid m) ⇒ (m, f m) → f m
adjustL (m, ms) = (m ⊕) <$> ms
Let’s consider how the Scan (g ∘ f)
instance plays out for top-down vs bottom-up trees, given the functor-composition encodings above. The critical definitions:
type Enc T2 = Id + Pair ∘ T2
type Enc T4 = Id + T4 ∘ Pair
Focusing on the branch case, we have Pair ∘ T2
vs T4 ∘ Pair
, so we’ll use the Scan (g ∘ f)
instance either way. Let’s consider the work implied by that instance. There are two calls to prefixScan
, plus a linear amount of other work. The meanings of those two calls differ, however:
- For top-down trees (
T2
), the recursive tree scans are infmap prefixScan
, mapping over the pair of trees. Thefirst prefixScan
is a pair scan and so does constant work. Since there are two recursive calls, each working on a tree of half size (assuming balance), plus linear other work, the total work , as explained above. - For bottom-up trees (
T4
), there is only one recursive recursive tree scan, which appears infirst prefixScan
. TheprefixScan
infmap prefixScan
is pair scan and so does constant work but is mapped over the half-sized tree (of pairs), and so does linear work altogether. Since there only one recursive tree scan, at half size, plus linear other work, the total work is then proportional to . So we have a work-efficient algorithm!
Looking deeper
In addition to the simple analysis above of scanning over top-down and over bottom-up, let’s look in detail at what transpires and how each case can be optimized. This section may well have more detail than you’re interested in. If so, feel free to skip ahead.
Top-down
Beginning as with Pair
,
prefixScan t
≡ {- specification -}
prefixScanEnc t
≡ {- prefixScanEnc definition -}
(second decode ∘ prefixScan ∘ encode) t
≡ {- (∘) -}
second decode (prefixScan (encode t))
Take T2
, with T3
being quite similar. Now split into two cases for the two constructors of T2
. First leaf:
prefixScan (L2 m)
≡ {- as above -}
second decode (prefixScan (encode (L2 m)))
≡ {- encode for L2 -}
second decode (prefixScan (InL (Id m)))
≡ {- prefixScan for functor sum -}
second decode (second InL (prefixScan (Id m)))
≡ {- prefixScan for Id -}
second decode (second InL (m, Id ∅))
≡ {- second for functions -}
second decode (m, InL (Id ∅))
≡ {- second for functions -}
(m, decode (InL (Id ∅)))
≡ {- decode for L2 -}
(m, L2 ∅)
Then branch:
prefixScan (B2 (s :# t))
≡ {- as above -}
second decode (prefixScan (encode (B2 (s :# t))))
≡ {- encode for L2 -}
second decode (prefixScan (InR (O (s :# t))))
≡ {- prefixScan for (+) -}
second decode (second InR (prefixScan (O (s :# t))))
≡ {- property of second -}
second (decode ∘ InR) (prefixScan (O (s :# t)))
Focus on the prefixScan
application:
prefixScan (O (s :# t)) =
≡ {- prefixScan for (∘) -}
( second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan
∘ unzip ∘ fmap prefixScan ∘ unO ) (O (s :# t))
≡ {- unO/O -}
( second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan
∘ unzip ∘ fmap prefixScan ) (s :# t)
≡ {- fmap on Pair -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan ∘ unzip)
(prefixScan s :# prefixScan t)
≡ {- expand prefixScan -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan ∘ unzip)
((ms,s') :# (mt,t'))
where (ms,s') = prefixScan s
(mt,t') = prefixScan t
≡ {- unzip -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan)
((ms :# mt), (s' :# t')) where ⋯
≡ {- first -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR)
(prefixScan (ms :# mt), (s' :# t')) where ⋯
≡ {- prefixScan for Pair -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR)
((ms ⊕ mt, (∅ :# ms)), (s' :# t')) where ⋯
≡ {- assocR -}
(second (O ∘ fmap adjustL ∘ zip))
(ms ⊕ mt, ((∅ :# ms), (s' :# t'))) where ⋯
≡ {- second -}
( ms ⊕ mt
, (O ∘ fmap adjustL ∘ zip) ((∅ :# ms), (s' :# t')) ) where ⋯
≡ {- zip -}
( ms ⊕ mt
, (O ∘ fmap adjustL) ((∅,s') :# (ms,t')) ) where ⋯
≡ {- fmap for Pair -}
( ms ⊕ mt
, O (adjustL (∅,s') :# adjustL (ms,t')) ) where ⋯
≡ {- adjustL -}
( ms ⊕ mt
, O (((∅ ⊕) <$> s') :# ((ms ⊕) <$> t')) ) where ⋯
≡ {- Monoid law (left identity) -}
( ms ⊕ mt
, O ((id <$> s') :# ((ms ⊕) <$> t')) ) where ⋯
≡ {- Functor law (fmap id) -}
( ms ⊕ mt
, O (s' :# ((ms ⊕) <$> t')) )
where (ms,s') = prefixScan s
(mt,t') = prefixScan t
Continuing from above,
prefixScan (B2 (s :# t))
≡ {- see above -}
second (decode ∘ InR) (prefixScan (O (s :# t)))
≡ {- prefixScan focus from above -}
second (decode ∘ InR)
( ms ⊕ mt
, O (s' :# ((ms ⊕) <$> t')) )
where (ms,s') = prefixScan s
(mt,t') = prefixScan t
≡ {- definition of second on functions -}
(ms ⊕ mt, (decode ∘ InR) (O (s' :# ((ms ⊕) <$> t')))) where ⋯
≡ {- (∘) -}
(ms ⊕ mt, decode (InR (O (s' :# ((ms ⊕) <$> t'))))) where ⋯
≡ {- decode for B2 -}
(ms ⊕ mt, B2 (s' :# ((ms ⊕) <$> t'))) where ⋯
This final form is as in Deriving parallel tree scans, changed for the new scan interface. The derivation saved some work in wrapping & unwrapping and method invocation, plus one of the two adjustment passes over the sub-trees. As explained above, this algorithm performs work.
I’ll leave suffixScan
for you to do yourself.
Bottom-up
What happens if we switch from top-down to bottom-up binary trees? I’ll use T4
(though T5
would work as well):
data T4 a = L4 a | B4 (T4 (Pair a))
The leaf case is just as with T2
above, so let’s get right to branches.
prefixScan (B4 t)
≡ {- as above -}
second decode (prefixScan (encode (B4 t)))
≡ {- encode for L2 -}
second decode (prefixScan (InR (O t)))
≡ {- prefixScan for (+) -}
second decode (second InR (prefixScan (O t)))
≡ {- property of second -}
second (decode ∘ InR) (prefixScan (O t))
As before, now focus on the prefixScan
call.
prefixScan (O t) =
≡ {- prefixScan for (∘) -}
( second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan
∘ unzip ∘ fmap prefixScan ∘ unO ) (O t)
≡ {- unO/O -}
( second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan
∘ unzip ∘ fmap prefixScan ) t
≡ {- prefixScan on Pair (derived above) -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan ∘ unzip)
fmap (λ (a :# b) → (a ⊕ b, (∅ :# a))) t
≡ {- unzip/fmap -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR ∘ first prefixScan)
( fmap (λ (a :# b) → (a ⊕ b)) t
, fmap (λ (a :# b) → (∅ :# a)) t )
≡ {- first on functions -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR)
( prefixScan (fmap (λ (a :# b) → (a ⊕ b)) t)
, fmap (λ (a :# b) → (∅ :# a)) t )
≡ {- expand prefixScan -}
(second (O ∘ fmap adjustL ∘ zip) ∘ assocR)
((mp,p'), fmap (λ (a :# b) → (∅ :# a)) t)
where (mp,p') = prefixScan (fmap (λ (a :# b) → (a ⊕ b)) t)
≡ {- assocR -}
(second (O ∘ fmap adjustL ∘ zip))
(mp, (p', fmap (λ (a :# b) → (∅ :# a)) t))
where ⋯
≡ {- second on functions -}
(mp, (O ∘ fmap adjustL ∘ zip) (p', fmap (λ (a :# b) → (∅ :# a)) t))
where ⋯
≡ {- fmap/zip/fmap -}
(mp, O (liftA2 tweak p' t))
where tweak s (a :# _) = adjustL (s, (∅ :# a))
(mp,p') = prefixScan (fmap (λ (a :# b) → (a ⊕ b)) t)
≡ {- adjustL, then simplify -}
(mp, O (liftA2 tweak p' t))
where tweak s (a :# _) = (s :# s ⊕ a)
(mp,p') = prefixScan (fmap (λ (a :# b) → (a ⊕ b)) t)
Now re-introduce the context of prefixScan (O t)
:
prefixScan (B4 t)
≡ {- see above -}
second (decode ∘ InR) (prefixScan (O t))
≡ {- see above -}
second (decode ∘ InR)
(mp, O (liftA2 tweak p' t))
where ⋯
≡ {- decode for T4 -}
(mp, B4 (liftA2 tweak p' t))
where p = fmap (λ (e :# o) → (e ⊕ o)) t
(mp,p') = prefixScan p
tweak s (e :# _) = (s :# s ⊕ e)
Notice how much this bottom-up tree scan algorithm differs from the top-down algorithm derived above. In particular, there’s only one recursive tree scan (on a half-sized tree) instead of two, plus linear additional work, for a total of work.
Guy Blelloch’s parallel scan algorithm
In Programming parallel algorithms, Guy Blelloch gives the following algorithm for parallel prefix scan, expressed in the parallel functional language NESL:
function scan(a) =
if #a ≡ 1 then [0]
else
let es = even_elts(a);
os = odd_elts(a);
ss = scan({e+o: e in es; o in os})
in interleave(ss,{s+e: s in ss; e in es})
This algorithm is nearly identical to the T4
scan algorithm above. I was very glad to find this route to Guy’s algorithm, which had been fairly mysterious to me. I mean, I could believe that the algorithm worked, but I had no idea how I might have discovered it myself. With the functor composition approach to scanning, I now see how Guy’s algorithm emerges as well as how it generalizes to other data structures.
Nested data types and parallelism
Most of the recursive algebraic data types that appear in Haskell programs are regular, meaning that the recursive instances are instantiated with the same type parameter as the containing type. For instance, a top-down tree of elements of type a
is either a leaf or has two trees whose elements have that same type a
. In contrast, in a bottom-up tree, the (single) recursively contained tree is over elements of type (a,a)
. Such non-regular data types are called "nested". The two tree scan algorithms above suggest to me that nested data types are particularly useful for efficient parallel algorithms.
Sjoerd Visscher:
I think prefixScan for T4 assumes a zip-like applicative instance for T4?
Guy Blelloch’s algorithm works on lists of any length, but your T4 trees can only contain 2^n elements. How should T4 be fixed so that can also contain any number of elements?
25 May 2011, 6:07 amconal:
For
T4
,prefixScan
is defined simply asprefixScanEnc
, without relying on anApplicative
instance.In Programming parallel algorithms, right above Figure 11 (containing the
scan
implementation), Guy says “The particular code shown works only on sequences that have length equal to a power of two, but it is not hard to generalize it to work on sequences of any length.”By using a more flexible data type, such as the binary random-access lists in Chris Okasaki’s Purely Functional Data Structures, section 9.2.1. Quoting from that section:
25 May 2011, 6:57 amSjoerd Visscher:
The optimized version of prefixScan (B4 t) ends up being (mp, B4 (liftA2 tweak p’ t)). Here “liftA2 tweak” lifts tweak to T4, right? And as the liftA2 comes from “zip = uncurry (liftA2 (,))”, I guess a zip-like instance is required?
25 May 2011, 7:19 amconal:
Ah, I see. You’re right. I’d forgotten that the
25 May 2011, 9:10 pmScan
instance forT4
does indeed depend on there being anApplicative
instance. And yes, I’d define it in a zip-like way. BecauseT4
has a sum and itsApplicative
is zip-like,(<*>)
andliftA2
will be partial, which is less than ideal. I hadn’t noticed this issue, because I was working out this algorithm for depth-typed trees, which have no (non-degenerate) sums and hence have total(<*>)
andliftA2
.Ryan Ingram:
T2’s applicative instance doesn’t follow the applicative laws, though.
for example:
pure id <*> x /= x
You can fix this by ‘cloning’ the leaf elements though:
L2 f <*> B2 (xs :# ys) = B2 (f xs :# f ys)
B2 (fs :# gs) <*> L2 x = B2 (($x) fs :# ($x) gs)
I think a similar trick works for T4.
1 June 2011, 2:09 pmConal Elliott » Blog Archive » A third view on trees:
[…] About « Parallel tree scanning by composition […]
3 June 2011, 6:46 pmconal:
Ryan: Oops! Thanks for the catch. I know the laws hold for depth-typed trees, since they’re tries (and thus isomorphic to functions, which do satisfy the laws), but I haven’t checked the laws for these non-depth-typed trees. I wonder whether they hold with your alteration.
17 June 2011, 9:41 am