## Fixing broken isomorphisms — details for non-strict memoization, part 2

The post Details for non-strict memoization, part 1 works out a systematic way of doing non-strict memoization, i.e., correct memoization of non-strict (and more broadly, non-hyper-strict) functions. As I mentioned at the end, there was an awkward aspect, which is that the purported “isomorphisms” used for regular types are not quite isomorphisms.

For instance, functions from triples are memoized by converting to and from nested pairs:

``````untriple ∷ (a,b,c) -> ((a,b),c)
untriple (a,b,c) = ((a,b),c)

triple ∷ ((a,b),c) -> (a,b,c)
triple ((a,b),c) = (a,b,c)
``````

Then `untriple` and `triple` form an embedding/projection pair, i.e.,

``````triple ∘ untriple ≡ id
untriple ∘ triple ⊑ id
``````

The reason for the inequality is that the nested-pair form permits `(⊥,c)`, which does not correspond to any triple.

``````untriple (triple (⊥,c)) ≡ untriple ⊥ ≡ ⊥
``````

Can we patch this problem by simply using an irrefutable (lazy) pattern in the definition of `triple`, i.e., `triple (~(a,b),c) = (a,b,c)`? Let’s try:

``````untriple (triple (⊥,c)) ≡ untriple (⊥,⊥,c) ≡ ((⊥,⊥),c)
``````

So isomorphism fails and so does even the embedding/projection property.

Similarly, to deal with regular algebraic data types, I used a class that describes regular data types as repeated applications of a single, associated pattern functor (following A Lightweight Approach to Datatype-Generic Rewriting):

``````class Functor (PF t) ⇒ Regular t where
type PF t ∷ * → *
unwrap ∷ t → PF t t
wrap   ∷ PF t t → t
``````

Here `unwrap` converts a value into its pattern functor form, and `wrap` converts back. For example, here is the `Regular` instance I had used for lists:

``````instance Regular [a] where
type PF [a] = Const () :+: Const a :*: Id

unwrap []     = InL (Const ())
unwrap (a:as) = InR (Const a :*: Id as)

wrap (InL (Const ()))          = []
wrap (InR (Const a :*: Id as)) = a:as
``````

Again, we have an embedding/projection pair, rather than a genuine isomorphism:

``````wrap ∘ unwrap ≡ id
unwrap ∘ wrap ⊑ id
``````

The inequality comes from ⊥ values occurring in `PF [a] [a]` at type `Const () [a]`, `()`, `(Const a :*: Id) [a]`, `Const a [a]`, or `Id [a]`.

Continue reading ‘Fixing broken isomorphisms — details for non-strict memoization, part 2’ »

## Details for non-strict memoization, part 1

In Non-strict memoization, I sketched out a means of memoizing non-strict functions. I gave the essential insight but did not show the details of how a nonstrict memoization library comes together. In this new post, I give details, which are a bit delicate, in terms of the implementation described in Elegant memoization with higher-order types.

Near the end, I run into some trouble with regular data types, which I don’t know how to resolve cleanly and efficiently.

Edits:

• 2010-09-10: Fixed minor typos.

Continue reading ‘Details for non-strict memoization, part 1’ »

## Non-strict memoization

I’ve written a few posts about functional memoization. In one of them, Luke Palmer commented that the memoization methods are correct only for strict functions, which I had not noticed before. In this note, I correct this flaw, extending correct memoization to non-strict functions as well. The semantic notion of least upper bound (which can be built of unambiguous choice) plays a crucial role.

Edits:

• 2010-07-13: Fixed the non-strict memoization example to use an argument of `undefined` (⊥) as intended.
• 2010-07-23: Changed spelling from “nonstrict” to the much more popular “non-strict”.
• 2011-02-16: Fixed minor typo. (“constraint on result” → “constraint on the result type”)

## Exact numeric integration

This post describes a simple way to integrate a function over an interval and get an exact answer. The question came out of another one, which is how to optimally render a continuous-space image onto a discrete array of pixels.

For anti-aliasing, I’ll make two simplying assumptions (to be revisited):

• Each pixel is a square area. (With apologies to Alvy Ray Smith.)
• Since I can choose only one color per pixel, I want exactly the average of the continuous image over pixel’s subregion of 2D space.

The average of a function over a region (here a continuous image over a 2D interval) is equal to the integral of the function across the region divided by the size (area for 2D) of the region. Since our regions are simple squares, the average and the integral can each be defined easily in terms of the other (dividing or multiplying by the size).

To simplify the problem further, I’ll consider one-dimensional integration, i.e., integrating a function of R over a 1D interval. My solution below involves the least upper bound operator I’ve written about (and its specialization `unamb`).

Continue reading ‘Exact numeric integration’ »

## Lazier function definitions by merging partial values

This post continues from an idea of Ryan Ingram’s in an email thread How to make code least strict?.

### A pretty story

Pattern matching in function definitions is very handy and has a declarative feel. For instance,

``````sum []     = 0
sum (x:xs) = x + sum xs
``````

Simply replace “`=`” by “`==`” to read such a set of pattern clauses (partial definitions) as a collection of properties specifying a `sum` function:

• The sum of an empty list equals zero
• The sum of a (non-empty) list `x:xs` equals `x` plus the sum of the `xs`.

Moreover, these properties define the `sum` function, in that `sum` is the least-defined function that satisfies these two properties.

Guards have a similar style and meaning:

``````abs x | x < 0 = -x
abs x | x >= 0 =  x
``````

Replacing “`=`” by “`==`” and guards by logical implication, we again have two properties that define `abs`:

``````x < 0 ==> abs x == -x
x >= 0 ==> abs x ==  x
``````

### O, the lies!

This pretty story is a lie, as becomes apparent when we look at overlapping clauses. For instance, we’re more likely to write `abs` without the second guard:

``````abs x | x < 0 = -x
abs x          =  x
``````

A declarative of the second clause (∀ x. abs x == x) is false.

I’d more likely write

``````abs x | x < 0     = -x
| otherwise =  x
``````

which is all the more deceptive, since “`otherwise`” doesn’t really mean otherwise. It’s just a synonym for “`True`“.

Another subtle but common problem arises with definitions like the following, as pointed out by ChrisK in How to make code least strict?:

``````zip :: [a] -> [b] -> [(a,b)]
zip []      _       = []
zip _       []      = []
zip (x:xs') (y:ys') = (x,y) : zip xs' ys'
``````

These three clauses read like independently true properties for `zip`. The first two clauses overlap, but their values agree, so what could possibly go wrong with a declarative reading?

The problem is that there are really three flavors of lists, not two. This definition explicitly addresses the nil and cons cases, leaving ⊥.

By the definition above, the value of ‘`zip [] ⊥`‘ is indeed `[]`, which is consistent with each clause. However, the value of ‘`zip ⊥ []`‘ is ⊥, because Haskell semantics says that each clause is tried in order, and the first clause forces evaluation of `⊥` when comparing it with `[]`. This ⊥ value is inconsistent with reading the second clause as a property. Swapping the first two clauses fixes the second example but breaks the first one.

Is it possible to fix `zip` so that its meaning is consistent with these three properties? We seem to be stuck with an arbitrary bias, with strictness in the first or second argument.

Or are we?

Continue reading ‘Lazier function definitions by merging partial values’ »

## Smarter termination for thread racing

I realized in the shower this morning that there’s a serious flaw in my unamb implementation as described in Functional concurrency with unambiguous choice. Here’s the code for racing two computations:

``````race :: IO a -> IO a -> IO a
a `race` b = do v  < - newEmptyMVar
ta <- forkPut a v
tb <- forkPut b v
x  <- takeMVar  v
return x

forkPut :: IO a -> MVar a -> IO ThreadId
forkPut act v = forkIO ((act >>= putMVar v) `catch` uhandler `catch` bhandler)
where
uhandler (ErrorCall "Prelude.undefined") = return ()
uhandler err                             = throw err
``````

The problem is that each of the threads `ta` and `tb` may have spawned other threads, directly or indirectly. When I kill them, they don’t get a chance to kill their sub-threads. If the parent thread does get killed, it will most likely happen during the `takeMVar`.

My first thought was to use some form of garbage collection of threads, perhaps akin to Henry Baker’s paper The Incremental Garbage Collection of Processes. As with memory GC, dropping one consumer would sometimes result is cascading de-allocations. That cascade is missing from my implementation above.

Or maybe there’s a simple and dependable manual solution, enhancing the method above.

I posted a note asking for ideas, and got the following suggestion from Peter Verswyvelen:

I thought that killing a thread was basically done by throwing a ThreadKilled exception using throwTo. Can’t these exception be caught?

In C#/F# I usually use a similar technique: catch the exception that kills the thread, and perform cleanup.

Playing with Peter’s suggestion works out very nicely, as described in this post.

## Merging partial values

Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for partial values and Implementing a type for partial values. Of particular interest is the ability to combine two partial values into one, combining the information present in each one.

More recently, I played with unambiguous choice, described in the previous post.

This post combines these two ideas. It describes how to work with partial values in Haskell natively, i.e., without using any special representation and without the use restrictions of unambiguous choice. I got inspired to try removing those restrictions during stimulating discussions with Thomas Davie, Russell O’Connor others in the #haskell gang.

The Reactive library implements functional reactive programming (FRP) in a data-driven and yet purely functional way, on top of a new primitive I call “unambiguous choice”, or `unamb`. This primitive has simple functional semantics and a concurrent implementation. The point is to allow one to try out two different ways to answer the same question, when it’s not known in advance which one will succeed first, if at all.
This post describes and demonstrates `unamb` and its implementation.