Lately I’ve been learning that some programming principles I treasure are not widely shared among my Haskell comrades. Or at least not widely among those I’ve been hearing from. I was feeling bummed, so I decided to write this post, in order to help me process the news and to see who resonates with what I’m looking for.
One of the principles I’m talking about is that the value of a closed expression (one not containing free variables) depends solely on the expression itself — not influenced by the dynamic conditions under which it is executed. I relate to this principle as the soul of functional programming and of referential transparency in particular.
- 2009-10-26: Minor typo fix
Recently I encountered two facts about standard Haskell libraries that I have trouble reconciling with this principle.
- The meaning of
Intoperations in overflow situations is machine-dependent. Typically they use 32 bits when running on 32-bit machine and 64 bits when running on 64-bit machines. Implementations are free to use as few as 29 bits. Thus the value of the expression “
2^32 == (0 ::Int)” may be either
True, depending on the dynamic conditions under which it is evaluated.
- The expression “
System.Info.os” has type
String, although its value as a sequence of characters depends on the circumstances of its execution. (Similarly for the other exports from
System.Info. Hm. I just noticed that the module is labeled as “portable”. Typo? Joke?)
Although I’ve been programming primarily in Haskell since around 1995, I didn’t realize that these implementation-dependent meanings were there. As in many romantic relationships, I suppose I’ve been seeing Haskell not as she is, but as I idealized her to be.
There’s another principle that is closely related to the one above and even more fundamental to me: every type has a precise, specific, and preferably simple denotation.
If an expression
e has type
T, then the meaning (value) of
e is a member of the collection denoted by
For instance, I think of the meaning of the type
String, i.e., of
[Char], as being sequences of characters.
Well, not quite that simple, because it also contains some partially defined sequences and has a partial information ordering (non-flat in this case).
Given this second principle, if
os :: String, then the meaning of
os is some sequence of characters.
Assuming the sequence is finite and non-partial, it can be written down as a literal string, and that literal can be substituted for every occurrence of “
os” in a program, without changing the program’s meaning.
os evaluates to “linux” on my machine and evaluates to “darwin” on my friend Bob’s machine, so substituting any literal string for “
os” would change the meaning, as observable on at least one of these machines.
Now I realize I’m really talking about standard Haskell libraries, not Haskell itself. When I discussed my confusion & dismay in the #haskell chat room, someone suggested explaining these semantic differences in terms of different libraries and hence different programs (if one takes programs to include the libraries they use). One would not expect different programs (due to different libraries) to have the same meaning.
I understand this different-library perspective — in a literal way. And yet I’m not really satisfied. What I get is that standard libraries are “standard” in signature (form), not in meaning (substance). With no promises about semantic commonality, I don’t know how standard libraries can be useful.
Another perspective that came up on #haskell was that the kind of semantic consistency I’m looking for is impossible, because of possibilities of failure.
For instance, evaluating an expression might one time fail due to memory exhaustion, while succeeding (perhaps just barely) on another attempt.
After mulling over that point, I’d like to weaken my principle a little.
Instead of asking that all evaluations of an expression yield same value, I ask that all evaluations of an expression yield consistent answers.
By “consistent” I mean in the sense of information content.
Answers don’t have to agree, but they must not disagree.
Failures like exhausted memory are modeled as ⊥, which is called “bottom” because it is the bottom of the information partial ordering.
It contains no information and so is consistent with every value, disagreeing with no value.
More precisely, values are consistent when they have a shared upper (information) bound, and inconsistent when they don’t.
The value ⊥ means i-don’t-know, and the value
(1,⊥,3) means (1, i-don’t-know, 3).
The consistent-value principle accepts possible failures due to finite resources and hardware failure, while rejecting “linux” vs “darwin” for
System.Info.os or False vs True for “
2^32 == (0 ::Int).
It also accepts
System.Info.os :: IO String, which is the type I would have expected, because the semantics of
IO String is big enough to accommodate dependence on dynamic conditions.
If you also cherish the principles I mention above, I’d love to hear from you.