Archive for 20th January 2009

## Lazier function definitions by merging partial values

This post continues from an idea of Ryan Ingram’s in an email thread How to make code least strict?.

### A pretty story

Pattern matching in function definitions is very handy and has a declarative feel. For instance,

sum []     = 0
sum (x:xs) = x + sum xs

Simply replace “=” by “==” to read such a set of pattern clauses (partial definitions) as a collection of properties specifying a sum function:

• The sum of an empty list equals zero
• The sum of a (non-empty) list x:xs equals x plus the sum of the xs.

Moreover, these properties define the sum function, in that sum is the least-defined function that satisfies these two properties.

Guards have a similar style and meaning:

abs x | x < 0 = -x
abs x | x >= 0 =  x

Replacing “=” by “==” and guards by logical implication, we again have two properties that define abs:

x < 0 ==> abs x == -x
x >= 0 ==> abs x ==  x

### O, the lies!

This pretty story is a lie, as becomes apparent when we look at overlapping clauses. For instance, we’re more likely to write abs without the second guard:

abs x | x < 0 = -x
abs x          =  x

A declarative of the second clause (∀ x. abs x == x) is false.

I’d more likely write

abs x | x < 0     = -x
| otherwise =  x

which is all the more deceptive, since “otherwise” doesn’t really mean otherwise. It’s just a synonym for “True“.

Another subtle but common problem arises with definitions like the following, as pointed out by ChrisK in How to make code least strict?:

zip :: [a] -> [b] -> [(a,b)]
zip []      _       = []
zip _       []      = []
zip (x:xs') (y:ys') = (x,y) : zip xs' ys'

These three clauses read like independently true properties for zip. The first two clauses overlap, but their values agree, so what could possibly go wrong with a declarative reading?

The problem is that there are really three flavors of lists, not two. This definition explicitly addresses the nil and cons cases, leaving ⊥.

By the definition above, the value of ‘zip [] ⊥‘ is indeed [], which is consistent with each clause. However, the value of ‘zip ⊥ []‘ is ⊥, because Haskell semantics says that each clause is tried in order, and the first clause forces evaluation of when comparing it with []. This ⊥ value is inconsistent with reading the second clause as a property. Swapping the first two clauses fixes the second example but breaks the first one.

Is it possible to fix zip so that its meaning is consistent with these three properties? We seem to be stuck with an arbitrary bias, with strictness in the first or second argument.

Or are we?

Continue reading ‘Lazier function definitions by merging partial values’ »