Fixing broken isomorphisms — details for non-strict memoization, part 2

The post Details for non-strict memoization, part 1 works out a systematic way of doing non-strict memoization, i.e., correct memoization of non-strict (and more broadly, non-hyper-strict) functions. As I mentioned at the end, there was an awkward aspect, which is that the purported “isomorphisms” used for regular types are not quite isomorphisms.

For instance, functions from triples are memoized by converting to and from nested pairs:

untriple ∷ (a,b,c) -> ((a,b),c)
untriple (a,b,c) = ((a,b),c)

triple ∷ ((a,b),c) -> (a,b,c)
triple ((a,b),c) = (a,b,c)
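
To make the role of this conversion concrete, here is a minimal sketch of how it can be put to work (the name memoViaIso and the explicit memoizer argument are mine, not the library’s API): given any memoizer for functions from nested pairs, a function from a triple is memoized by converting its argument with untriple and converting the memoized function’s argument back with triple.

-- Sketch only: memoPairs stands for whatever memoizer is available for
-- functions from nested pairs.
memoViaIso :: ((((a,b),c) -> r) -> (((a,b),c) -> r))  -- memoizer for nested-pair functions
           -> ((a,b,c) -> r) -> ((a,b,c) -> r)
memoViaIso memoPairs f = g . untriple
  where g = memoPairs (f . triple)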

Then untriple and triple form an embedding/projection pair, i.e.,

triple ∘ untriple ≡ id
untriple ∘ triple ⊑ id

The reason for the inequality is that the nested-pair form permits (⊥,c), which does not correspond to any triple.

untriple (triple (⊥,c)) ≡ untriple ⊥ ≡ ⊥

Can we patch this problem by simply using an irrefutable (lazy) pattern in the definition of triple, i.e., triple (~(a,b),c) = (a,b,c)? Let’s try:

untriple (triple (⊥,c)) ≡ untriple (⊥,⊥,c) ≡ ((⊥,⊥),c)

So not only does the isomorphism still fail; now even the embedding/projection property fails, since ((⊥,⊥),c) is strictly more defined than (⊥,c) rather than less.

Similarly, to deal with regular algebraic data types, I used a class that describes regular data types as repeated applications of a single, associated pattern functor (following A Lightweight Approach to Datatype-Generic Rewriting):

class Functor (PF t) ⇒ Regular t where
  type PF t ∷ * → *
  unwrap ∷ t → PF t t
  wrap   ∷ PF t t → t

Here unwrap converts a value into its pattern functor form, and wrap converts back. For example, here is the Regular instance I had used for lists:

instance Regular [a] where
  type PF [a] = Const () :+: Const a :*: Id

  unwrap []     = InL (Const ())
  unwrap (a:as) = InR (Const a :*: Id as)

  wrap (InL (Const ()))          = []
  wrap (InR (Const a :*: Id as)) = a:as
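
For example (a worked instance of these definitions, with sample values of my own):

unwrap []      ≡ InL (Const ())
unwrap [1,2,3] ≡ InR (Const 1 :*: Id [2,3])

wrap (InR (Const 1 :*: Id [2,3])) ≡ 1 : [2,3] ≡ [1,2,3]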

Again, we have an embedding/projection pair, rather than a genuine isomorphism:

wrap ∘ unwrap ≡ id
unwrap ∘ wrap ⊑ id

The inequality comes from ⊥ values occurring in PF [a] [a] at type Const () [a], (), (Const a :*: Id) [a], Const a [a], or Id [a].
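
For instance (a worked example of my own), the Const () pattern in wrap forces the ⊥ it contains, so the round trip ends up strictly below the original InL (Const ⊥):

wrap (InL (Const ⊥))          ≡ ⊥
unwrap (wrap (InL (Const ⊥))) ≡ unwrap ⊥ ≡ ⊥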

Continue reading ‘Fixing broken isomorphisms — details for non-strict memoization, part 2’ »

Details for non-strict memoization, part 1

In Non-strict memoization, I sketched out a means of memoizing non-strict functions. I gave the essential insight but did not show the details of how a non-strict memoization library comes together. In this post, I give those details, which are a bit delicate, in terms of the implementation described in Elegant memoization with higher-order types.

Near the end, I run into some trouble with regular data types, which I don’t know how to resolve cleanly and efficiently.

Edits:

  • 2010-09-10: Fixed minor typos.

Continue reading ‘Details for non-strict memoization, part 1’ »

Non-strict memoization

I’ve written a few posts about functional memoization. In one of them, Luke Palmer commented that the memoization methods are correct only for strict functions, which I had not noticed before. In this note, I correct this flaw, extending correct memoization to non-strict functions as well. The semantic notion of least upper bound (which can be built from unambiguous choice) plays a crucial role.

Edits:

  • 2010-07-13: Fixed the non-strict memoization example to use an argument of undefined (⊥) as intended.
  • 2010-07-23: Changed spelling from “nonstrict” to the much more popular “non-strict”.
  • 2011-02-16: Fixed minor typo. (“constraint on result” → “constraint on the result type”)

Continue reading ‘Non-strict memoization’ »

Exact numeric integration

This post describes a simple way to integrate a function over an interval and get an exact answer. The question came out of another one, which is how to optimally render a continuous-space image onto a discrete array of pixels.

For anti-aliasing, I’ll make two simplifying assumptions (to be revisited):

  • Each pixel is a square area. (With apologies to Alvy Ray Smith.)
  • Since I can choose only one color per pixel, I want exactly the average of the continuous image over the pixel’s subregion of 2D space.

The average of a function over a region (here a continuous image over a 2D interval) is equal to the integral of the function across the region divided by the size (area for 2D) of the region. Since our regions are simple squares, the average and the integral can each be defined easily in terms of the other (dividing or multiplying by the size).

To simplify the problem further, I’ll consider one-dimensional integration, i.e., integrating a function of R over a 1D interval. My solution below involves the least upper bound operator I’ve written about (and its specialization unamb).
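
As a quick sketch of the average/integral relationship in this 1D setting (my own names; the post’s actual integrator is exact and built on lub/unamb rather than passed in as a parameter):

-- Given any integrator for real functions over an interval, the average
-- over that interval is the integral divided by the interval's width.
averageOn :: ((Double -> Double) -> Double -> Double -> Double)  -- integrator: f, lo, hi
          -> (Double -> Double) -> Double -> Double -> Double
averageOn integrate f lo hi = integrate f lo hi / (hi - lo)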

Continue reading ‘Exact numeric integration’ »

Lazier function definitions by merging partial values

This post continues from an idea of Ryan Ingram’s in an email thread How to make code least strict?.

A pretty story

Pattern matching in function definitions is very handy and has a declarative feel. For instance,

sum []     = 0
sum (x:xs) = x + sum xs

Simply replace “=” by “==” to read such a set of pattern clauses (partial definitions) as a collection of properties specifying a sum function:

  • The sum of an empty list equals zero
  • The sum of a (non-empty) list x:xs equals x plus the sum of xs.

Moreover, these properties define the sum function, in that sum is the least-defined function that satisfies these two properties.

Guards have a similar style and meaning:

abs x | x < 0 = -x
abs x | x >= 0 =  x

Replacing “=” by “==” and guards by logical implication, we again have two properties that define abs:

x < 0 ==> abs x == -x
x >= 0 ==> abs x ==  x

O, the lies!

This pretty story is a lie, as becomes apparent when we look at overlapping clauses. For instance, we’re more likely to write abs without the second guard:

abs x | x < 0 = -x
abs x          =  x

A declarative reading of the second clause (∀ x. abs x == x) is false.

I’d more likely write

abs x | x < 0     = -x
      | otherwise =  x

which is all the more deceptive, since “otherwise” doesn’t really mean otherwise. It’s just a synonym for “True”.

Another subtle but common problem arises with definitions like the following, as pointed out by ChrisK in How to make code least strict?:

zip :: [a] -> [b] -> [(a,b)]
zip []      _       = []
zip _       []      = []
zip (x:xs') (y:ys') = (x,y) : zip xs' ys'

These three clauses read like independently true properties for zip. The first two clauses overlap, but their values agree, so what could possibly go wrong with a declarative reading?

The problem is that there are really three flavors of lists, not two. This definition explicitly addresses the nil and cons cases, leaving out the third: ⊥.

By the definition above, the value of ‘zip [] ⊥’ is indeed [], which is consistent with each clause. However, the value of ‘zip ⊥ []’ is ⊥, because Haskell semantics says that each clause is tried in order, and the first clause forces evaluation of the first argument (here ⊥) when comparing it with []. This ⊥ value is inconsistent with reading the second clause as a property. Swapping the first two clauses fixes the second example but breaks the first one.
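
To make the bias visible (zipL and zipR are my own names, not from the thread): zipL is the definition above, and zipR swaps its first two clauses.

zipL, zipR :: [a] -> [b] -> [(a,b)]

zipL []      _       = []
zipL _       []      = []
zipL (x:xs') (y:ys') = (x,y) : zipL xs' ys'

zipR _       []      = []
zipR []      _       = []
zipR (x:xs') (y:ys') = (x,y) : zipR xs' ys'

-- zipL [] ⊥ ≡ []   but   zipL ⊥ [] ≡ ⊥
-- zipR ⊥ [] ≡ []   but   zipR [] ⊥ ≡ ⊥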

Is it possible to fix zip so that its meaning is consistent with these three properties? We seem to be stuck with an arbitrary bias, with strictness in the first or second argument.

Or are we?

Continue reading ‘Lazier function definitions by merging partial values’ »

Smarter termination for thread racing

I realized in the shower this morning that there’s a serious flaw in my unamb implementation as described in Functional concurrency with unambiguous choice. Here’s the code for racing two computations:

race :: IO a -> IO a -> IO a
a `race` b = do v  <- newEmptyMVar
                ta <- forkPut a v
                tb <- forkPut b v
                x  <- takeMVar  v
                killThread ta
                killThread tb
                return x

forkPut :: IO a -> MVar a -> IO ThreadId
forkPut act v = forkIO ((act >>= putMVar v) `catch` uhandler `catch` bhandler)
 where
   uhandler (ErrorCall "Prelude.undefined") = return ()
   uhandler err                             = throw err
   bhandler BlockedOnDeadMVar               = return ()

The problem is that each of the threads ta and tb may have spawned other threads, directly or indirectly. When I kill them, they don’t get a chance to kill their sub-threads. Worse, if the parent thread itself gets killed (most likely during the takeMVar), then ta and tb never get killed at all.

My first thought was to use some form of garbage collection of threads, perhaps akin to Henry Baker’s paper The Incremental Garbage Collection of Processes. As with memory GC, dropping one consumer would sometimes result in cascading de-allocations. That cascade is missing from my implementation above.

Or maybe there’s a simple and dependable manual solution, enhancing the method above.

I posted a note asking for ideas, and got the following suggestion from Peter Verswyvelen:

I thought that killing a thread was basically done by throwing a ThreadKilled exception using throwTo. Can’t these exceptions be caught?

In C#/F# I usually use a similar technique: catch the exception that kills the thread, and perform cleanup.

Playing with Peter’s suggestion works out very nicely, as described in this post.
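
Here is a minimal sketch of the direction Peter’s suggestion points in (my own illustrative code, not the fix worked out in the rest of the post): attach a cleanup action, such as killing whatever sub-threads were spawned, so that it runs even when the thread itself is killed, since killThread delivers an asynchronous exception rather than terminating the thread silently.

import Control.Concurrent (ThreadId, forkIO, killThread)
import Control.Exception  (finally)

-- Fork an action together with a cleanup that runs whether the action
-- finishes normally, fails with an exception, or is killed via killThread.
forkWithCleanup :: IO () -> IO () -> IO ThreadId
forkWithCleanup act cleanup = forkIO (act `finally` cleanup)

-- For example, cleanup could be 'mapM_ killThread kids' for a list of
-- sub-threads tracked by the forked action.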

Continue reading ‘Smarter termination for thread racing’ »

Merging partial values

Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for partial values and Implementing a type for partial values. Of particular interest is the ability to combine two partial values into one, combining the information present in each one.
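
As a tiny example of such a merge (my own, in the semantic notation used elsewhere on this blog), the least upper bound of two consistent partial pairs combines the information from both:

(⊥, False) ⊔ (True, ⊥) ≡ (True, False)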

More recently, I played with unambiguous choice, described in the previous post.

This post combines these two ideas. It describes how to work with partial values in Haskell natively, i.e., without using any special representation and without the use restrictions of unambiguous choice. I got inspired to try removing those restrictions during stimulating discussions with Thomas Davie, Russell O’Connor, and others in the #haskell gang.

You can download and play with the library described here. There are links and a bit more info on the lub wiki page.

Continue reading ‘Merging partial values’ »

Functional concurrency with unambiguous choice

The Reactive library implements functional reactive programming (FRP) in a data-driven and yet purely functional way, on top of a new primitive I call “unambiguous choice”, or unamb. This primitive has simple functional semantics and a concurrent implementation. The point is to allow one to try out two different ways to answer the same question, when it’s not known in advance which one will succeed first, if at all.

This post describes and demonstrates unamb and its implementation.
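
For a flavor of how unamb gets used, here is the standard parallel-or example (assuming unamb :: a -> a -> a, whose two arguments must agree wherever both are defined):

-- Parallel or: returns True as soon as either argument is True, even if
-- the other is ⊥. Both alternatives agree whenever both are defined.
por :: Bool -> Bool -> Bool
por a b = (a || b) `unamb` (b || a)

-- por True ⊥ ≡ True
-- por ⊥ True ≡ True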

Continue reading ‘Functional concurrency with unambiguous choice’ »