Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for partial values and Implementing a type for partial values. Of particular interest is the ability to combine two partial values into one, combining the information present in each one.
More recently, I played with unambiguous choice, described in the previous post.
This post combines these two ideas. It describes how to work with partial values in Haskell natively, i.e., without using any special representation and without the use restrictions of unambiguous choice. I got inspired to try removing those restrictions during stimulating discussions with Thomas Davie, Russell O’Connor others in the #haskell gang.
You can download and play with the library shown described here. There are links and a bit more info on the lub wiki page.
- 2008-11-22: Fixed link: Implementing a type for partial values
- 2008-11-22: Tidied introduction
Information, more or less
The meanings of programming languages are often defined via a technique called “denotational semantics”. In this style, one specifies a mathematical model, or semantic domain, for the meanings of utterances in the and then writes what looks like a recursive functional program that maps from syntax to semantic domain. That “program” defines the semantic function. Really, there’s one domain and one semantic function each syntactic category. In typed languages like Haskell, every type has an associated semantic domain.
One of the clever ideas of the theory of semantic domains (“domain theory”) is to place a partial ordering on values (domain members),
based on information content.
Values can not only be equal and unequal, they can also have more or less information content than each other.
The value with the least information is at the bottom of this ordering, and so is called “bottom”, often written as “⊥”.
In Haskell, ⊥ is the meaning of “
For succinctness below, I’ll write “⊥” instead of “
undefined” in Haskell code.
Many types have corresponding flat domains, meaning that the only values are either completely undefined completely defined.
For instance, the Haskell type
Integer is (semantically) flat.
Its values are all either ⊥ or integers.
Structured types are not flat.
For instance, the meaning of (i.e., the domain corresponding to) the Haskell type
(Bool,Integer) contains five different kinds of
values, as shown in the figure.
Each arrow leads from a less-defined (less informative) value to a more-defined value.
To handle the diversity of Haskell types, define a class of types for which we know how to compute lubs.
class HasLub a where (⊔) :: a -> a -> a
The actual lub library, uses “
lub” instead of “
The arguments must be consistent, i.e., must have a common upper bound. This precondition is not checked statically or dynamically, so the programmer must take care.
For flat types,
(⊔) is equivalent to
unamb (see Functional concurrency with unambiguous choice), so
-- Flat types: instance HasLub () where (⊔) = unamb instance HasLub Bool where (⊔) = unamb instance HasLub Char where (⊔) = unamb instance HasLub Integer where (⊔) = unamb instance HasLub Float where (⊔) = unamb ...
We can handle pairs easily enough:
instance (HasLub a, HasLub b) => HasLub (a,b) where (a,b) ⊔ (a',b') = (a ⊔ a', b ⊔ b')
but we’d be wrong! This definition is too strict, e.g.,
(1,2) ⊔ ⊥ == ⊥
when we’d want
No problem. Just make the patterns lazy:
instance (HasLub a, HasLub b) => HasLub (a,b) where ~(a,b) ⊔ ~(a',b') = (a ⊔ a', b ⊔ b')
Oops — wrong again. This definition is too lazy:
⊥ ⊔ ⊥ == (⊥,⊥)
when we’d want ⊥.
We can fix the too-lazy version by checking that one of the arguments is non-bottom, which is what
Which one to we check? The one that isn’t
⊥, or either one if they’re both defined.
unamb can manage this task:
instance (HasLub a, HasLub b) => HasLub (a,b) where ~p@(a,b) ⊔ ~p'@(a',b') = (p `unamb` p') `seq` (a ⊔ a', b ⊔ b')
But there’s a catch (which I realized only now):
p' may not satisfy the precondition on
(If they did, we could use
(⊔) = unamb.)
To fix this last problem, check whether each pair is defined.
We can’t know which to check first, so test concurrently, using
instance (HasLub a, HasLub b) => HasLub (a,b) where ~p@(a,b) ⊔ ~p'@(a',b') = (definedP p `unamb` definedP p') `seq` (a ⊔ a', b ⊔ b')
definedP :: (a,b) -> Bool definedP (_,_) = True
The implicit second case in that definition is
definedP ⊥ = ⊥.
*Data.Lub> (⊥,False) ⊔ (True,⊥) (True,False) *Data.Lub> (⊥,(⊥,False)) ⊔ ((),(⊥,⊥)) ⊔ (⊥,(True,⊥)) ((),(True,False))
For sums, we’ll discriminate between lefts and rights, with the following assistants:
isL :: Either a b -> Bool isL = either (const True) (const False) outL :: Either a b -> a outL = either id (error "outL on Right") outR :: Either a b -> b outR = either (error "outR on Left") id
(⊔) method for sums unwraps its arguments as a
Left or as a
Right, after checking which kind of arguments it gets.
We can’t know which one to check first, so check concurrently:
instance (HasLub a, HasLub b) => HasLub (Either a b) where s ⊔ s' = if isL s `unamb` isL s' then Left (outL s ⊔ outL s') else Right (outR s ⊔ outR s')
Exercise: Why is the use of
unamb here legal?
f is said to be less (or equally) defined than a function
f is less (or equally) defined
g for every argument
instance HasLub b => HasLub (a -> b) where f ⊔ g = a -> f a ⊔ g a
instance HasLub b => HasLub (a -> b) where (⊔) = liftA2 (⊔)
We’ve already handled the unit type
(), other flat types, pairs, sums, and functions.
Algebraic data types can be modeled via this standard set, with a technique from generic programming.
Define methods that map to and from a type in the standard set.
class HasRepr t r | t -> r where -- Required: unrepr . repr == id repr :: t -> r -- to repr unrepr :: r -> t -- from repr
We can implement
(⊔) on a type by performing a
(⊔) on that type’s standard representation, as follows:
repLub :: (HasRepr a v, HasLub v) => a -> a -> a a `repLub` a' = unrepr (repr a ⊔ repr a') instance (HasRepr t v, HasLub v) => HasLub t where (⊔) = repLub
However, this rule would overlap with all other
HasLub instances, because Haskell instance selection is based only on the head of an instance definition, i.e., the part after the “
Instead, we’ll define a
HasLub instance per
For instance, here are encodings for
Maybe and for lists:
instance HasRepr (Maybe a) (Either () a) where repr Nothing = (Left ()) repr (Just a) = (Right a) unrepr (Left ()) = Nothing unrepr (Right a) = (Just a) instance HasRepr [a] (Either () (a,[a])) where repr  = (Left ()) repr (a:as) = (Right (a,as)) unrepr (Left ()) =  unrepr (Right (a,as)) = (a:as)
instance HasLub a => HasLub (Maybe a) where (⊔) = repLub instance HasLub a => HasLub [a] where (⊔) = repLub
*Data.Lub> [1,⊥,2] ⊔ [⊥,3,2] [1,3,2]