<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Conal Elliott &#187; semantics</title>
	<atom:link href="http://conal.net/blog/tag/semantics/feed" rel="self" type="application/rss+xml" />
	<link>http://conal.net/blog</link>
	<description>Inspirations &#38; experiments, mainly about denotative/functional programming in Haskell</description>
	<lastBuildDate>Thu, 25 Jul 2019 18:15:11 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.1.17</generator>
	<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2F&amp;language=en_US&amp;category=text&amp;title=Conal+Elliott&amp;description=Inspirations+%26amp%3B+experiments%2C+mainly+about+denotative%2Ffunctional+programming+in+Haskell&amp;tags=blog" type="text/html" />
	<item>
		<title>Garbage collecting the semantics of FRP</title>
		<link>http://conal.net/blog/posts/garbage-collecting-the-semantics-of-frp</link>
		<comments>http://conal.net/blog/posts/garbage-collecting-the-semantics-of-frp#comments</comments>
		<pubDate>Mon, 04 Jan 2010 21:55:30 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[derivative]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[FRP]]></category>
		<category><![CDATA[functional reactive programming]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=96</guid>
		<description><![CDATA[Ever since ActiveVRML, the model we&#8217;ve been using in functional reactive programming (FRP) for interactive behaviors is (T-&#62;a) -&#62; (T-&#62;b), for dynamic (time-varying) input of type a and dynamic output of type b (where T is time). In &#8220;Classic FRP&#8221; formulations (including ActiveVRML, Fran &#38; Reactive), there is a &#8220;behavior&#8221; abstraction whose denotation is a [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Garbage collecting the semantics of FRP

Tags: FRP, functional reactive programming, semantics, design, derivative

URL: http://conal.net/blog/posts/garbage-collecting-the-semantics-of-frp/

-->

<!-- references -->

<!-- teaser -->

<p>Ever since <a href="http://conal.net/papers/ActiveVRML/" title="Tech report: &quot;A Brief Introduction to ActiveVRML&quot;">ActiveVRML</a>, the model we&#8217;ve been using in functional reactive programming (FRP) for interactive behaviors is <code>(T-&gt;a) -&gt; (T-&gt;b)</code>, for dynamic (time-varying) input of type <code>a</code> and dynamic output of type <code>b</code> (where <code>T</code> is time).
In &#8220;Classic FRP&#8221; formulations (including <a href="http://conal.net/papers/ActiveVRML/" title="Tech report: &quot;A Brief Introduction to ActiveVRML&quot;">ActiveVRML</a>, <a href="http://conal.net/papers/icfp97/" title="paper">Fran</a> &amp; <a href="http://conal.net/papers/push-pull-frp/" title="Paper by Conal Elliott and Paul Hudak">Reactive</a>), there is a &#8220;behavior&#8221; abstraction whose denotation is a function of time.
Interactive behaviors are then modeled as host language (e.g., Haskell) functions between behaviors.
Problems with this formulation are described in <em><a href="http://conal.net/blog/posts/why-classic-FRP-does-not-fit-interactive-behavior/" title="blog post">Why classic FRP does not fit interactive behavior</a></em>.
These same problems motivated &#8220;Arrowized FRP&#8221;.
In Arrowized FRP, behaviors (renamed &#8220;signals&#8221;) are purely conceptual.
They are part of the semantic model but do not have any realization in the programming interface.
Instead, the abstraction is a <em>signal transformer</em>, <code>SF a b</code>, whose semantics is <code>(T-&gt;a) -&gt; (T-&gt;b)</code>.
See <em><a href="http://conal.net/papers/genuinely-functional-guis.pdf" title="Paper by Antony Courtney and Conal Elliott">Genuinely Functional User Interfaces</a></em> and <em><a href="http://www.haskell.org/yale/papers/haskellworkshop02/" title="Paper by Henrik Nilsson, Antony Courtney, and John Peterson">Functional Reactive Programming, Continued</a></em>.</p>
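<p>In types, the two formulations can be sketched as follows. This is my own illustrative sketch (the names and the choice of <code>T = Double</code> are assumptions, not definitions from the papers above):</p>

```haskell
type T = Double

-- Classic FRP: a behavior denotes a function of time, and interactive
-- behaviors are host-language functions between behaviors.
newtype Behavior a = Behavior (T -> a)

-- Arrowized FRP: signals (T -> a) are purely conceptual; the programming
-- abstraction is the signal transformer SF a b, whose denotation is
-- (T -> a) -> (T -> b).
newtype SF a b = SF ((T -> a) -> (T -> b))

runSF :: SF a b -> (T -> a) -> (T -> b)
runSF (SF f) = f

identitySF :: SF a a
identitySF = SF id
```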

<p>Whether in its classic or arrowized embodiment, I&#8217;ve been growing uncomfortable with this semantic model of functions between time functions.
A few weeks ago, I realized that one source of discomfort is that this model is <em>mostly junk</em>.</p>

<p>This post contains some partially formed thoughts about how to eliminate the junk (&#8220;garbage collect the semantics&#8221;), and what might remain.</p>

<!--
**Edits**:

* 2009-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-96"></span></p>

<p>There are two generally desirable properties for a denotational semantics: <em>full abstraction</em> and <em>junk-freeness</em>.
Roughly, &#8220;full abstraction&#8221; means we must not distinguish between what is (operationally) indistinguishable, while &#8220;junk-freeness&#8221; means that every semantic value must be denotable.</p>

<p>FRP&#8217;s semantic model, <code>(T-&gt;a) -&gt; (T-&gt;b)</code>, allows not only arbitrary (computable) transformation of input values, but also of time.
The output at some time can depend on the input at any time at all, or even on the input at arbitrarily many different times.
Consequently, this model allows responding to <em>future</em> input, violating a principle sometimes called &#8220;causality&#8221;, which is that outputs may depend on the past or present but not the future.</p>
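<p>For instance, nothing in the model forbids a transformer that samples its input in the future. A tiny sketch (the names and <code>T = Double</code> are mine):</p>

```haskell
type T = Double

-- The bare semantic model: functions between time functions.
type Model a b = (T -> a) -> (T -> b)

-- Non-causal: the output at time t depends on the input one second ahead.
precog :: Model a a
precog input t = input (t + 1)
```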

<p>In a causal system, the present can reach backward to the past but not forward to the future.
I&#8217;m uneasy about this ability as well.
Arbitrary access to the past may be much more powerful than necessary.
As evidence, consult the system we call (physical) Reality.
As far as I can tell, Reality operates without arbitrary access to the past or to the future, and it does a pretty good job at expressiveness.</p>

<p>Moreover, arbitrary past access is also problematic to implement in its semantically simple generality.</p>

<p>There is a thing we informally call &#8220;memory&#8221;, which at first blush may look like access to the past, but it isn&#8217;t really.
Rather, memory is access to a <em>present</em> input, which has come into being through a process of filtering, gradual accumulation, and discarding (forgetting).
I&#8217;m talking about &#8220;memory&#8221; here in the sense of what our brains do, but also what all the rest of physical reality does.
For instance, weather marks on a rock are part of the rock&#8217;s (present) memory of the past weather.</p>

<p>A very simple memory-less semantic model of interactive behavior is just <code>a -&gt; b</code>.
This model is too restrictive, however, as it cannot support <em>any</em> influence of the past on the present.</p>

<p>Which leaves a question: what is a simple and adequate formal model of interactive behavior that reaches neither into the past nor into the future, and yet still allows the past to influence the present?
Inspired in part by a design principle I call &#8220;what would reality do?&#8221; (WWRD), I&#8217;m happy to have some kind of infinitesimal access to the past, but nothing further.</p>

<p>My current intuition is that differentiation/integration plays a crucial role.
That information is carried forward moment by moment in time as &#8220;momentum&#8221; in some sense.</p>
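<p>One way to make this intuition concrete is integration, where information is carried forward step by step and each new state uses only the present state. A minimal Euler-step sketch (my illustration, for an autonomous system <code>y' = f y</code> with step size <code>h</code>):</p>

```haskell
-- Euler integration: each state arises from the present state plus an
-- (ideally infinitesimal) step -- no arbitrary reach into past or future.
euler :: Double -> (Double -> Double) -> Double -> [Double]
euler h f = iterate step
  where step y = y + h * f y
```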

<blockquote>
  <p><em>I call intuition cosmic fishing. You feel a nibble, then you&#8217;ve got to hook the fish.</em> &#8211; Buckminster Fuller</p>
</blockquote>

<p>Where to go with these intuitions?</p>

<p>Perhaps interactive behaviors are some sort of function with all of its derivatives.
See <em><a href="http://conal.net/blog/posts/beautiful-differentiation/" title="blog post">Beautiful differentiation</a></em> for a specification and derived implementation of numeric operations, and more generally of <code>Functor</code> and <code>Applicative</code>, on which much of FRP is based.</p>

<p>I suspect the whole event model can be replaced by integration.
Integration is the main remaining piece.</p>

<p>How weak a semantic model can let us define integration?</p>

<h3>Thanks</h3>

<p>My thanks to Luke Palmer and to Noam Lewis for some clarifying chats about these half-baked ideas.
And to the folks on #haskell IRC for <a href="http://tunes.org/~nef/logs/haskell/10.01.04">brainstorming titles for this post</a>.
My favorite suggestions were</p>

<ul>
<li>luqui: instance HasJunk FRP where</li>
<li>luqui: Functional reactive programming&#8217;s semantic baggage</li>
<li>sinelaw: FRP, please take out the trash!</li>
<li>cale: Garbage collecting the semantics of FRP</li>
<li>BMeph: Take out the FRP-ing Trash</li>
</ul>

<p>all of which I preferred over my original &#8220;FRP is mostly junk&#8221;.</p>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=96&amp;md5=a0b309c313791bd63f34ab08b5fb4c3b"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/garbage-collecting-the-semantics-of-frp/feed</wfw:commentRss>
		<slash:comments>34</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fgarbage-collecting-the-semantics-of-frp&amp;language=en_GB&amp;category=text&amp;title=Garbage+collecting+the+semantics+of+FRP&amp;description=Ever+since+ActiveVRML%2C+the+model+we%26%238217%3Bve+been+using+in+functional+reactive+programming+%28FRP%29+for+interactive+behaviors+is+%28T-%26gt%3Ba%29+-%26gt%3B+%28T-%26gt%3Bb%29%2C+for+dynamic+%28time-varying%29+input+of+type+a+and+dynamic+output...&amp;tags=derivative%2Cdesign%2CFRP%2Cfunctional+reactive+programming%2Csemantics%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Thoughts on semantics for 3D graphics</title>
		<link>http://conal.net/blog/posts/thoughts-on-semantics-for-3d-graphics</link>
		<comments>http://conal.net/blog/posts/thoughts-on-semantics-for-3d-graphics#comments</comments>
		<pubDate>Mon, 23 Nov 2009 07:41:30 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[arrow]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[geometry]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[type class morphism]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=90</guid>
		<description><![CDATA[The central question for me in designing software is always What does it mean? With functional programming, this question is especially crisp. For each data type I define, I want to have a precise and simple mathematical model. (For instance, my model for behavior is function-of-time, and my model of images is function-of-2D-space.) Every operation [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Thoughts on semantics for 3D graphics

Tags: semantics, design, 3D, arrow, type class morphism, geometry

URL: http://conal.net/blog/posts/thoughts-on-semantics-for-3d-graphics/

-->

<!-- references -->

<!-- teaser -->

<p>The central question for me in designing software is always</p>

<blockquote>
  <p>What does it mean?</p>
</blockquote>

<p>With functional programming, this question is especially crisp.
For each data type I define, I want to have a precise and simple mathematical model.
(For instance, my model for behavior is function-of-time, and my model of images is function-of-2D-space.)
Every operation on the type is also given a meaning in terms of that semantic model.</p>

<p>This specification process, which is denotational semantics applied to data types, provides a basis for</p>

<ul>
<li>correctness of the implementation,</li>
<li>user documentation free of implementation detail,</li>
<li>generating and proving properties, which can then be used in automated testing, and</li>
<li>evaluating and comparing the elegance and expressive power of design decisions.</li>
</ul>

<p>For an example (2D images), some motivation of this process, and discussion, see Luke Palmer&#8217;s post <em><a href="http://lukepalmer.wordpress.com/2008/07/18/semantic-design/" title="Blog post by Luke Palmer">Semantic Design</a></em>.
See also my posts on the idea and use of <em><a href="http://conal.net/blog/tag/type-class-morphism/" title="Posts on type class morphisms">type class morphisms</a></em>, which provide additional structure to denotational design.</p>

<p>In spring of 2008, I started working on a functional 3D library, <a href="http://haskell.org/haskellwiki/FieldTrip" title="Library wiki page">FieldTrip</a>.
I&#8217;ve designed functional 3D libraries before as part of <a href="http://conal.net/tbag/" title="Project web page">TBAG</a>, <a href="http://conal.net/papers/ActiveVRML/" title="Tech report: &quot;A Brief Introduction to ActiveVRML&quot;">ActiveVRML</a>, and <a href="http://conal.net/Fran" title="Functional reactive animation">Fran</a>.
This time I wanted a semantics-based design, for all of the reasons given above.
As always, I want a model that is</p>

<ul>
<li>simple, </li>
<li>elegant, and </li>
<li>general.</li>
</ul>

<p>For 3D, I also want the model to be GPU-friendly, i.e., to execute well on (modern) GPUs and to give access to their abilities.</p>

<p>I hadn&#8217;t thought of or heard a model that I was happy with, and so I didn&#8217;t have the sort of firm ground I like to stand on in working on FieldTrip.
Last February, such a model occurred to me.
I&#8217;ve had this blog post mostly written since then.
Recently, I&#8217;ve been focused on functional 3D again for GPU-based rendering, and then Sean McDirmid <a href="http://mcdirmid.wordpress.com/2009/11/20/designing-a-gpu-oriented-geometry-abstraction-part-one/">posed a similar question</a>, which got me thinking again.</p>

<!--
**Edits**:

* 2008-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-90"></span></p>

<h3>Geometry</h3>

<p>3D graphics involves a variety of concepts.
Let&#8217;s start with 3D geometry, using a <em>surface</em> (rather than a <em>solid</em>) model.</p>

<p>Examples of 3D (surface) geometry include</p>

<ul>
<li>the boundary (surface) of a solid box, sphere, or torus,</li>
<li>a filled triangle, rectangle, or circle,</li>
<li>a collection of geometry, and</li>
<li>a spatial transformation of geometry.</li>
</ul>

<h4>First model: set of geometric primitives</h4>

<p>One model of geometry is a set of geometric primitives.
In this model, <code>union</code> means set union, and spatial transformation means transforming all of the 3D points in all of the primitives in the set.
Primitives contain infinitely (even uncountably) many points, so that&#8217;s a lot of transforming.
Fortunately, we&#8217;re talking about what (semantics), and not how (implementation).</p>

<p><em>What is a geometric primitive?</em></p>

<p>We could say it&#8217;s a triangle, specified by three coordinates.
After all, computer graphics reduces everything to sets of triangles.
Oops &#8212; we&#8217;re confusing semantics and implementation.
Tessellation <em>approximates</em> curved surfaces by sets of triangles but loses information in the process.
I want a story that includes this approximation process but keeps it clearly distinct from semantically ideal curved surfaces.
Then users can work with the ideal, simple semantics and rely on the implementation to perform intelligent, dynamic, view-dependent tessellation that adapts to available hardware resources.</p>

<p>Another model of geometric primitive is a function from 2D space to 3D space, i.e., the &#8220;parametric&#8221; representation of surfaces.
Along with the function, we&#8217;ll probably want some means of describing the subset of 2D over which the surface is defined, so as to trim our surfaces.
A simple formalization would be</p>

<pre><code>type Surf = R2 -&gt; Maybe R3
</code></pre>

<p>where</p>

<pre><code>type R  -- real numbers
type R2 = (R,R)
type R3 = (R,R,R)
</code></pre>
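<p>For concreteness, here is one surface expressible in this model. The sphere parameterization is my own illustration (with <code>R</code> taken as <code>Double</code>), not part of the original design:</p>

```haskell
type R    = Double
type R2   = (R, R)
type R3   = (R, R, R)
type Surf = R2 -> Maybe R3

-- Unit sphere, parameterized over [0,1] x [0,1] (longitude/latitude);
-- Nothing outside that square, i.e., a trimmed surface.
sphere :: Surf
sphere (u, v)
  | inUnit u && inUnit v = Just (cos th * cos ph, sin th * cos ph, sin ph)
  | otherwise            = Nothing
  where
    inUnit x = 0 <= x && x <= 1
    th = 2 * pi * u          -- longitude
    ph = pi * (v - 0.5)      -- latitude
```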

<p>For shading, we&#8217;ll also need normals, and possibly tangents &amp; bitangents.
We can get these features and more by including derivatives, either just first derivatives or all of them.
See my <a href="http://conal.net/blog/tag/derivative/" title="Posts on derivatives">posts on derivatives</a> and paper <em><a href="http://conal.net/papers/beautiful-differentiation/" title="Paper: Beautiful differentiation">Beautiful differentiation</a></em>.</p>

<p>In addition to position and derivatives, each point on a primitive also has material properties, which determine how light is reflected by and transmitted through the surface at the point.</p>

<pre><code>type Surf = R2 -&gt; Maybe (R2 :&gt; R3, Material)
</code></pre>

<p>where <code>a :&gt; b</code> contains all derivatives (including zeroth) at a point of a function of type <code>a-&gt;b</code>.
See <em><a href="http://conal.net/blog/posts/higher-dimensional-higher-order-derivatives-functionally/" title="blog post">Higher-dimensional, higher-order derivatives, functionally</a></em>.
We could perhaps also include derivatives of material properties:</p>

<pre><code>type Surf = R2 :~&gt; Maybe (R3, Material)
</code></pre>

<p>where <code>a :~&gt; b</code> is the type of infinitely differentiable functions.</p>

<h4>Combining geometry values</h4>

<p>The <code>union</code> function gives one way to combine two geometry values.
Another is morphing (interpolation) of positions and of material properties.
What can the semantics of morphing be?</p>

<p>Morphing between two <em>surfaces</em> is easier to define.
A surface is a function, so we can interpolate <em>point-wise</em>: given surfaces <code>r</code> and <code>s</code>, for each point <code>p</code> in parameter space, interpolate between (a) <code>r</code> at <code>p</code> and (b) <code>s</code> at <code>p</code>, which is what <code>liftA2</code> (on functions) would suggest.</p>

<p>This definition works <em>if</em> we have a way to interpolate between <code>Maybe</code> values.
If we use <code>liftA2</code> again, now on <code>Maybe</code> values, then the <code>Just</code>/<code>Nothing</code> (and <code>Nothing</code>/<code>Just</code>) cases will yield <code>Nothing</code>.
Is this semantics desirable?
As an example, consider a flat square surface with a hole in the middle.
One square has a small hole, and the other has a big hole.
If the size of the hole corresponds to size of the portion of parameter space mapped to <code>Nothing</code>, then point-wise interpolation will always yield the larger hole, rather than interpolating between hole sizes.
On the other hand, the two surfaces with holes might be <code>Just</code> over exactly the same set of parameters, with the function determining how much the <code>Just</code> space gets stretched.</p>
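<p>The point-wise reading above amounts to two nested uses of <code>liftA2</code>: once at functions, once at <code>Maybe</code>. A sketch, with a hypothetical one-dimensional <code>lerp1</code> standing in for interpolation of positions and materials:</p>

```haskell
import Control.Applicative (liftA2)

lerp1 :: Double -> Double -> Double -> Double
lerp1 t a b = a + t * (b - a)

-- Point-wise morph of partial surfaces: liftA2 on functions (same
-- parameter point), then liftA2 on Maybe. Just/Nothing and Nothing/Just
-- yield Nothing, so holes persist, as discussed above.
morph :: Double -> (u -> Maybe Double) -> (u -> Maybe Double)
      -> (u -> Maybe Double)
morph t = liftA2 (liftA2 (lerp1 t))
```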

<p>One way to characterize this awkwardness of morphing is that the two functions (surfaces) might have <em>different domains</em>.
This interpretation comes from seeing <code>a -&gt; Maybe b</code> as encoding a function from a <em>subset</em> of <code>a</code> (i.e., a <em>partial</em> function on <code>a</code>).</p>

<p>Even if we had a satisfactory way to combine surfaces (point-wise), how could we extend it to combining full geometry values, which can contain any number of surfaces?
One idea is to model geometry as a <em>structured</em> collection of surfaces, e.g., a list.
Then we could combine the collections element-wise.
Again, we&#8217;d have to deal with the possibility that the collections do not match up.</p>

<h3>Surface tuples</h3>

<p>Let&#8217;s briefly return to a simpler model of surfaces:</p>

<pre><code>type Surf = R2 -&gt; R3
</code></pre>

<p>We could represent a collection of such surfaces as a structured collection, e.g., a list:</p>

<pre><code>type Geometry = [Surf]
</code></pre>

<p>But then the type doesn&#8217;t capture the number of surfaces, leading to mismatches when combining geometry values point-wise.</p>

<p>Alternatively, we could make the number of surfaces explicit in the type, via tuples, possibly nested.
For instance, two surfaces would have type <code>(Surf,Surf)</code>.</p>

<p>Interpolation in this model becomes very simple.
A general interpolator works on vector spaces:</p>

<pre><code>lerp :: VectorSpace v =&gt; v -&gt; v -&gt; Scalar v -&gt; v
lerp a b t = a ^+^ t*^(b ^-^ a)
</code></pre>

<p>or on affine spaces:</p>

<pre><code>alerp :: (AffineSpace p, VectorSpace (Diff p)) =&gt;
         p -&gt; p -&gt; Scalar (Diff p) -&gt; p
alerp p p' s = p .+^ s*^(p' .-. p)
</code></pre>

<p>Both definitions are in the <a href="http://haskell.org/haskellwiki/vector-space" title="Library wiki page">vector-space</a> package.
That package also includes <code>VectorSpace</code> and <code>AffineSpace</code> instances for both functions and tuples.
These instances, together with instances for real values, suffice to make (possibly nested) tuples of surfaces be vector spaces and affine spaces.</p>

<h3>From products to sums</h3>

<p>Function pairing admits some useful isomorphisms.
One replaces a product of functions with a single function into a product:</p>

<!-- $$(a to b) times (a to c) cong a to (b times c)$$ -->

<pre><code>(a → b) × (a → c) ≅ a → (b × c)
</code></pre>

<p>Using this product/product isomorphism, we could replace tuples of surfaces with a single function from <em>R<sup>2</sup></em> to tuples of <em>R<sup>3</sup></em>.</p>

<p>There is also a handy isomorphism that relates products to sums, in the context of functions:</p>

<!-- $$(b to a) times (c to a) cong (b + c) to a$$ -->

<pre><code>(b → a) × (c → a) ≅ (b + c) → a
</code></pre>

<p>This second isomorphism lets us replace tuples of surfaces with a single &#8220;surface&#8221;, if we generalize the notion of surface to include domains more complex than <em>R<sup>2</sup></em>.</p>

<p>In fact, these two isomorphisms are uncurried forms of the general and useful Haskell functions <code>(&amp;&amp;&amp;)</code> and <code>(|||)</code>, defined on arrows:</p>

<pre><code>(&amp;&amp;&amp;) :: Arrow       (~&gt;) =&gt; (a ~&gt; b) -&gt; (a ~&gt; c) -&gt; (a ~&gt; (b,c))
(|||) :: ArrowChoice (~&gt;) =&gt; (a ~&gt; c) -&gt; (b ~&gt; c) -&gt; (Either a b ~&gt; c)
</code></pre>

<p>Restricted to the function arrow, <code>(|||) == either</code>.</p>
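<p>At the function arrow, both isomorphisms can be witnessed directly with these standard combinators (the helper names are mine):</p>

```haskell
import Control.Arrow ((&&&), (|||))

-- (a -> b) x (a -> c)  is isomorphic to  a -> (b, c)
pairUp :: (a -> b, a -> c) -> (a -> (b, c))
pairUp = uncurry (&&&)

-- (b -> a) x (c -> a)  is isomorphic to  Either b c -> a
merge :: (b -> a, c -> a) -> (Either b c -> a)
merge = uncurry (|||)
```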

<p>The second isomorphism, <code>uncurry (|||)</code>, has another benefit.
Relaxing the domain type to allow sums opens the way to other domain variations as well.
For instance, we can have types for triangular domains, shapes with holes, and other flavors of bounded and unbounded parameter spaces.
All of these domains are two-dimensional, although they may result from several patches.</p>

<p>Our <code>Geometry</code> type now becomes parameterized:</p>

<pre><code>type Geometry a = a -&gt; (R3,Material)
</code></pre>

<p>The first isomorphism, <code>uncurry (&amp;&amp;&amp;)</code>, is also useful in a geometric setting.
Think of each component of the range type (here <code>R3</code> and <code>Material</code>) as a surface &#8220;attribute&#8221;.
Then <code>(&amp;&amp;&amp;)</code> merges two compatible geometries, including attributes from each.
Attributes could include position (and derivatives) and shading-related material, as well as non-visual properties like temperature, elasticity, stickiness, etc.</p>

<p>With this flexibility in mind, <code>Geometry</code> gets a second type parameter, which is the range type.
Now there&#8217;s nothing left of the <code>Geometry</code> type but general functions:</p>

<pre><code>type Geometry = (-&gt;)
</code></pre>

<p>Recall that we&#8217;re looking for a <em>semantics</em> for 3D geometry.
The <em>type</em> for <code>Geometry</code> might be abstract, with <code>(-&gt;)</code> being its semantic model.
In that case, the model suggests that <code>Geometry</code> have all of the same type class instances that <code>(-&gt;)</code> (and its full or partial applications) has, including <code>Monoid</code>, <code>Functor</code>, <code>Applicative</code>, <code>Monad</code>, and <code>Arrow</code>.
The semantics of these instances would be given by the corresponding instances for <code>(-&gt;)</code>.
(See posts on <a href="http://conal.net/blog/tag/type-class-morphism/" title="Posts on type class morphisms">type class morphisms</a> and the paper <em><a href="http://conal.net/blog/posts/denotational-design-with-type-class-morphisms/" title="blog post">Denotational design with type class morphisms</a></em>.)</p>

<p>Or drop the notion of <code>Geometry</code> altogether and use functions directly.</p>

<h3>Domains</h3>

<p>I&#8217;m happy with the simplicity of geometry as functions.
Functions fit the flexibility of programmable GPUs, and they provide simple, powerful &amp; familiar notions of attribute merging (<code>(&amp;&amp;&amp;)</code>) and union (<code>(|||)</code>/<code>either</code>).</p>

<p>The main question I&#8217;m left with: what are the domains?</p>

<p>One simple domain is a one-dimensional interval, say [-1,1].</p>

<p>Two useful domain building blocks are sum and product.
I mentioned sum above, in connection with geometric union (<code>(|||)</code>/<code>either</code>).
Product combines domains into higher-dimensional domains.
For instance, the product of two 1D intervals is a 2D interval (axis-aligned filled rectangle), which is handy for some parametric surfaces.</p>

<p>What about other domains, e.g., triangular, or having one or more holes?  Or multi-way branching surfaces?  Or unbounded?</p>

<p>One idea is to stitch together simple domains using sum.
We don&#8217;t have to build any particular spatial shapes or sizes, since the &#8220;geometry&#8221; functions themselves yield the shape and size.
For instance, a square region can be mapped to a triangular or even circular region.
An infinite domain can be stitched together from infinitely many finite domains.
Or it can be mapped to from a single finite domain.
For instance, the function <code>x -&gt; x / (1 - abs x)</code> maps [-1,1] to [-∞,∞].</p>
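<p>A sketch of that last map (assuming the intended function is <code>x / (1 - abs x)</code>, which is monotonic on the interior and sends ±1 to ±∞ in floating point):</p>

```haskell
-- Maps [-1,1] onto [-inf,inf]; the open interior maps onto the reals.
stretch :: Double -> Double
stretch x = x / (1 - abs x)
```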

<p>Alternatively, we could represent domains as typed predicates (characteristic functions).
For instance, the closed interval [-1,1] would be <code>x -&gt; abs x &lt;= 1</code>.
Replacing <code>abs</code> with <code>magnitude</code> (for <a href="http://hackage.haskell.org/packages/archive/vector-space/latest/doc/html/Data-VectorSpace.html#t%3AInnerSpace">inner product spaces</a>) generalizes this formulation to encompass [-1,1] (1D), a unit disk (2D), and a unit ball (3D).</p>
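<p>A sketch of the predicate formulation, with explicit magnitudes in place of the vector-space package&#8217;s <code>magnitude</code> so the example stays self-contained:</p>

```haskell
type Predicate a = a -> Bool

-- The closed interval [-1,1].
interval :: Predicate Double
interval x = abs x <= 1

-- Unit disk (2D) and unit ball (3D): the same shape of definition,
-- generalized over the magnitude function.
disk :: Predicate (Double, Double)
disk (x, y) = sqrt (x * x + y * y) <= 1

ball :: Predicate (Double, Double, Double)
ball (x, y, z) = sqrt (x * x + y * y + z * z) <= 1
```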

<p>I like the simple generality of the predicate approach, and I also like how the pure type approach supports interpolation and other pointwise operations (via <code>liftA2</code> etc.).</p>

<h3>Tessellation</h3>

<p>I&#8217;ve intentionally formulated the graphics semantics over continuous space, which makes it resolution-independent and easy to compose.
(This formulation is typical for 3D geometry and 2D vector graphics.
The benefits of continuity apply generally to <a href="http://conal.net/Pan/Gallery" title="Pan image gallery">imagery</a> and to <a href="http://conal.net/Fran/tutorial.htm" title="Animated tutorial: &quot;Composing Reactive Animations&quot;">animation/behavior</a>.)</p>

<p>Graphics hardware specializes in finite collections of triangles.
For rendering, curved surfaces have to be <em>tessellated</em>, i.e., approximated as collections of triangles.
Desirable choice of tessellation depends on characteristics of the surface and of the view, as well as scene complexity and available CPU and GPU resources.
Formulating geometry in its ideal curved form allows for automated analysis and choice of tessellation.
For instance, since triangles are linear, the error of a triangle relative to the surface it approximates depends on how <em>non-linear</em> the surface is over the subset of its domain corresponding to the triangle.
Using <a href="http://en.wikipedia.org/wiki/Talk:Interval_arithmetic" title="Wikipedia page on interval analysis/arithmetic">interval analysis</a> and <a href="http://conal.net/blog/tag/derivative/" title="Posts on derivatives">derivatives</a>, non-linearity can be measured as a size bound on the second derivative or a range of first derivative.
Error could also be analyzed in terms of the resulting image rather than the surface.</p>

<p>For a GPU-based implementation, one could tessellate dynamically, in a &#8220;geometry shader&#8221; or (I presume) in a more general framework like CUDA or OpenCL.</p>

<h3>Abstractness</h3>

<p>A denotational model is &#8220;fully abstract&#8221; when it equates observationally equivalent terms.
The parametric model of surfaces is not fully abstract, in that reparameterizing a surface yields a different function that denotes the same geometric surface.
(Surface reparametrization alters the relationship between domain and range, while covering exactly the same surface, geometrically.)
Properties that are independent of particular parametrization are called &#8220;geometric&#8221;, which I think corresponds to full abstraction (considering those properties as semantic functions).</p>

<p>What might a fully abstract (geometric) model for geometry be?</p>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=90&amp;md5=3a9d3f59c6f8c7d6110dbfab2555bddf"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/thoughts-on-semantics-for-3d-graphics/feed</wfw:commentRss>
		<slash:comments>18</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fthoughts-on-semantics-for-3d-graphics&amp;language=en_GB&amp;category=text&amp;title=Thoughts+on+semantics+for+3D+graphics&amp;description=The+central+question+for+me+in+designing+software+is+always+What+does+it+mean%3F+With+functional+programming%2C+this+question+is+especially+crisp.+For+each+data+type+I+define%2C+I+want...&amp;tags=3D%2Carrow%2Cdesign%2Cgeometry%2Csemantics%2Ctype+class+morphism%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Notions of purity in Haskell</title>
		<link>http://conal.net/blog/posts/notions-of-purity-in-haskell</link>
		<comments>http://conal.net/blog/posts/notions-of-purity-in-haskell#comments</comments>
		<pubDate>Mon, 30 Mar 2009 19:00:48 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[purity]]></category>
		<category><![CDATA[referential transparency]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=86</guid>
		<description><![CDATA[Lately I&#8217;ve been learning that some programming principles I treasure are not widely shared among my Haskell comrades. Or at least not widely among those I&#8217;ve been hearing from. I was feeling bummed, so I decided to write this post, in order to help me process the news and to see who resonates with what [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Notions of purity in Haskell

Tags: purity, referential transparency, semantics

URL: http://conal.net/blog/posts/notions-of-purity-in-haskell/

-->

<!-- references -->

<!-- teaser -->

<p>Lately I&#8217;ve been learning that some programming principles I treasure are not widely shared among my Haskell comrades.
Or at least not widely among those I&#8217;ve been hearing from.
I was feeling bummed, so I decided to write this post, in order to help me process the news and to see who resonates with what I&#8217;m looking for.</p>

<p>One of the principles I&#8217;m talking about is that the value of a closed expression (one not containing free variables) depends solely on the expression itself &#8212; not influenced by the dynamic conditions under which it is executed.
I relate to this principle as the soul of functional programming and of referential transparency in particular.</p>

<p><strong>Edits</strong>:</p>

<ul>
<li>2009-10-26: Minor typo fix</li>
</ul>

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-86"></span></p>

<p>Recently I encountered two facts about standard Haskell libraries that I have trouble reconciling with this principle.</p>

<ul>
<li>The meaning of <code>Int</code> operations in overflow situations is machine-dependent.  Typically they use 32 bits when running on a 32-bit machine and 64 bits when running on a 64-bit machine.  Implementations are free to use as few as 29 bits.  Thus the value of the expression &#8220;<code>2^32 == (0 ::Int)</code>&#8221; may be either <code>False</code> or <code>True</code>, depending on the dynamic conditions under which it is evaluated.</li>
<li>The expression &#8220;<code>System.Info.os</code>&#8221; has type <code>String</code>, although its value as a sequence of characters depends on the circumstances of its execution.  (Similarly for the other exports from <a href="http://haskell.org/ghc/docs/latest/html/libraries/base/System-Info.html"><code>System.Info</code></a>.  Hm.  I just noticed that the module is labeled as &#8220;portable&#8221;.  Typo?  Joke?)</li>
</ul>
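<p>A small probe of both machine-dependent values (my own sketch, not part of the original post; its output varies by platform, which is exactly the point):</p>

```haskell
-- Sketch: probing two machine-dependent "pure" values.
-- No particular output is claimed; it differs across machines.
import Data.Bits (finiteBitSize)
import qualified System.Info

main :: IO ()
main = do
  -- The Haskell report only guarantees a small minimum range for Int;
  -- GHC typically uses the native word size.
  print (finiteBitSize (0 :: Int))  -- e.g. 64 on a 64-bit machine
  print ((2 ^ 32 :: Int) == 0)      -- False on 64-bit, True on 32-bit GHC
  print System.Info.os              -- e.g. "linux" or "darwin"
```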

<p>Although I&#8217;ve been programming primarily in Haskell since around 1995, I didn&#8217;t realize that these implementation-dependent meanings were there.
As in many romantic relationships, I suppose I&#8217;ve been seeing Haskell not as she is, but as I idealized her to be.</p>

<p>There&#8217;s another principle that is closely related to the one above and even more fundamental to me: every type has a precise, specific, and preferably simple denotation.
If an expression <code>e</code> has type <code>T</code>, then the meaning (value) of <code>e</code> is a member of the collection denoted by <code>T</code>.
For instance, I think of the meaning of the type <code>String</code>, i.e., of <code>[Char]</code>, as being sequences of characters.
Well, not quite that simple, because it also contains some partially defined sequences and has a partial information ordering (non-flat in this case).
Given this second principle, if <code>os :: String</code>, then the meaning of <code>os</code> is some sequence of characters.
Assuming the sequence is finite and non-partial, it can be written down as a literal string, and that literal can be substituted for every occurrence of &#8220;<code>os</code>&#8221; in a program, without changing the program&#8217;s meaning.
However, <code>os</code> evaluates to &#8220;linux&#8221; on my machine and evaluates to &#8220;darwin&#8221; on my friend Bob&#8217;s machine, so substituting <em>any</em> literal string for &#8220;<code>os</code>&#8221; would change the meaning, as observable on at least one of these machines.</p>

<p>Now I realize I&#8217;m really talking about standard Haskell <em>libraries</em>, not Haskell itself.
When I <a href="http://tunes.org/~nef/logs/haskell/09.03.29">discussed my confusion &amp; dismay in the #haskell chat room</a>, someone suggested explaining these semantic differences in terms of different libraries and hence different programs (if one takes programs to include the libraries they use).
One would not expect different programs (due to different libraries) to have the same meaning.</p>

<p>I understand this different-library perspective &#8212; in a literal way.
And yet I&#8217;m not really satisfied.
What I get is that standard libraries are &#8220;standard&#8221; in signature (form), not in meaning (substance).
With no promises about semantic commonality, I don&#8217;t know how standard libraries can be useful.</p>

<p>Another perspective that came up on #haskell was that the kind of semantic consistency I&#8217;m looking for is <em>impossible</em>, because of possibilities of failure.
For instance, evaluating an expression might one time fail due to memory exhaustion, while succeeding (perhaps just barely) on another attempt.
After mulling over that point, I&#8217;d like to weaken my principle a little.
Instead of asking that all evaluations of an expression yield the <em>same</em> value, I ask that all evaluations of an expression yield <em>consistent</em> answers.
By &#8220;consistent&#8221; I mean in the sense of information content.
<em>Answers don&#8217;t have to agree, but they must not disagree.</em>
Failures like exhausted memory are modeled as ⊥, which is called &#8220;bottom&#8221; because it is the bottom of the information partial ordering.
It contains no information and so is consistent with every value, disagreeing with no value.
More precisely, values are <em>consistent</em> when they have a shared upper (information) bound, and <em>inconsistent</em> when they don&#8217;t.
The value ⊥ means <em>i-don&#8217;t-know</em>, and the value <code>(1,⊥,3)</code> means (1, <em>i-don&#8217;t-know</em>, 3).
The consistent-value principle accepts possible failures due to finite resources and hardware failure, while rejecting &#8220;linux&#8221; vs &#8220;darwin&#8221; for <code>System.Info.os</code> or <code>False</code> vs <code>True</code> for &#8220;<code>2^32 == (0 ::Int)</code>&#8221;.
It also accepts <code>System.Info.os :: IO String</code>, which is the type I would have expected, because the semantics of <code>IO String</code> is big enough to accommodate dependence on dynamic conditions.</p>
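<p>To make the ⊥-consistency point concrete, here is a small sketch (mine, not part of the original post): under non-strict evaluation, a tuple with a ⊥ component still yields its defined components, so <code>(1,⊥,3)</code> genuinely carries the information (1, <em>i-don&#8217;t-know</em>, 3).</p>

```haskell
-- Sketch: a bottom (⊥) component carries no information, but it does
-- not poison the defined components, thanks to non-strict evaluation.
main :: IO ()
main = do
  let (a, _, c) = (1 :: Int, undefined :: Int, 3 :: Int)
  print (a + c)  -- prints 4; the ⊥ in the middle is never demanded
```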

<p>If you also cherish the principles I mention above, I&#8217;d love to hear from you.</p>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=86&amp;md5=9ff66b6506b3b491599ed696fdd04e2d"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/notions-of-purity-in-haskell/feed</wfw:commentRss>
		<slash:comments>55</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fnotions-of-purity-in-haskell&amp;language=en_GB&amp;category=text&amp;title=Notions+of+purity+in+Haskell&amp;description=Lately+I%26%238217%3Bve+been+learning+that+some+programming+principles+I+treasure+are+not+widely+shared+among+my+Haskell+comrades.+Or+at+least+not+widely+among+those+I%26%238217%3Bve+been+hearing+from.+I...&amp;tags=purity%2Creferential+transparency%2Csemantics%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Denotational design with type class morphisms</title>
		<link>http://conal.net/blog/posts/denotational-design-with-type-class-morphisms</link>
		<comments>http://conal.net/blog/posts/denotational-design-with-type-class-morphisms#comments</comments>
		<pubDate>Thu, 19 Feb 2009 02:34:08 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[applicative functor]]></category>
		<category><![CDATA[arrow]]></category>
		<category><![CDATA[associated type]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[monad]]></category>
		<category><![CDATA[monoid]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[trie]]></category>
		<category><![CDATA[type class morphism]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=84</guid>
		<description><![CDATA[I&#8217;ve just finished a draft of a paper called Denotational design with type class morphisms, for submission to ICFP 2009. The paper is on a theme I&#8217;ve explored in several posts, which is semantics-based design, guided by type class morphisms. I&#8217;d love to get some readings and feedback. Pointers to related work would be particularly [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Denotational design with type class morphisms

Tags: paper, semantics, type class morphism, monoid, functor, applicative functor, monad, arrow, associated type, trie

URL: http://conal.net/blog/posts/denotational-design-with-type-class-morphisms/

-->

<!-- references -->

<!-- teaser -->

<p>I&#8217;ve just finished a draft of a paper called <em><a href="http://conal.net/papers/type-class-morphisms" title="paper">Denotational design with type class morphisms</a></em>, for submission to <a href="http://www.cs.nott.ac.uk/~gmh/icfp09.html" title="conference page">ICFP 2009</a>.
The paper is on a theme I&#8217;ve explored in <a href="http://conal.net/blog/tag/type-class-morphism/">several posts</a>, which is semantics-based design, guided by type class morphisms.</p>

<p>I&#8217;d love to get some readings and feedback.
Pointers to related work would be particularly appreciated, as well as what&#8217;s unclear and what could be cut.
It&#8217;s an entire page over the limit, so I&#8217;ll have to do some trimming before submitting.</p>

<p>The abstract:</p>

<blockquote>
  <p>Type classes provide a mechanism for varied implementations of standard
  interfaces. Many of these interfaces are founded in mathematical
  tradition and so have regularity not only of <em>types</em> but also of
  <em>properties</em> (laws) that must hold. Types and properties give strong
  guidance to the library implementor, while leaving freedom as well. Some
  of the remaining freedom is in <em>how</em> the implementation works, and some
  is in <em>what</em> it accomplishes.</p>
  
  <p>To give additional guidance to the <em>what</em>, without impinging on the
  <em>how</em>, this paper proposes a principle of <em>type class morphisms</em> (TCMs),
  which further refines the compositional style of denotational
  semantics. The TCM idea is simply that <em>the instance&#8217;s meaning is the
  meaning&#8217;s instance</em>. This principle determines the meaning of each type
  class instance, and hence defines correctness of implementation. In some
  cases, it also provides a systematic guide to implementation, and in
  some cases, valuable design feedback.</p>
  
  <p>The paper is illustrated with several examples of types, meanings, and
  morphisms.</p>
</blockquote>

<p>You can <a href="http://conal.net/papers/type-class-morphisms" title="paper">get the paper and see current errata here</a>.</p>

<p>The submission deadline is March 2, so comments before then are most helpful to me.</p>

<p>Enjoy, and thanks!</p>

<!--
**Edits**:

* 2009-02-09: just fiddling around
-->
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=84&amp;md5=8ce3b83d01ccfad97ade1469b72d2a04"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/denotational-design-with-type-class-morphisms/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fdenotational-design-with-type-class-morphisms&amp;language=en_GB&amp;category=text&amp;title=Denotational+design+with+type+class+morphisms&amp;description=I%26%238217%3Bve+just+finished+a+draft+of+a+paper+called+Denotational+design+with+type+class+morphisms%2C+for+submission+to+ICFP+2009.+The+paper+is+on+a+theme+I%26%238217%3Bve+explored+in+several...&amp;tags=applicative+functor%2Carrow%2Cassociated+type%2Cfunctor%2Cmonad%2Cmonoid%2Cpaper%2Csemantics%2Ctrie%2Ctype+class+morphism%2Cblog" type="text/html" />
	</item>
		<item>
		<title>What is automatic differentiation, and why does it work?</title>
		<link>http://conal.net/blog/posts/what-is-automatic-differentiation-and-why-does-it-work</link>
		<comments>http://conal.net/blog/posts/what-is-automatic-differentiation-and-why-does-it-work#comments</comments>
		<pubDate>Wed, 28 Jan 2009 20:09:42 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[derivative]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[type class morphism]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=79</guid>
		<description><![CDATA[Bertrand Russell remarked that Everything is vague to a degree you do not realize till you have tried to make it precise. I&#8217;m mulling over automatic differentiation (AD) again, neatening up previous posts on derivatives and on linear maps, working them into a coherent whole for an ICFP submission. I understand the mechanics and some [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- teaser -->

<p
>Bertrand Russell remarked that</p
>

<blockquote
><p
  ><em
    >Everything is vague to a degree you do not realize till you have tried to make it precise.</em
    ></p
  ></blockquote
>

<p
>I&#8217;m mulling over automatic differentiation (AD) again, neatening up previous posts on <a href="http://conal.net/blog/tag/derivative/" title="posts on derivatives"
  >derivatives</a
  > and on <a href="http://conal.net/blog/tag/linear-map/" title="posts on linear maps"
  >linear maps</a
  >, working them into a coherent whole for an ICFP submission. I understand the mechanics and some of the reasons for its correctness. After all, it&#8217;s &quot;just the chain rule&quot;.</p
>

<p
>As usual, in the process of writing, I bumped up against Russell&#8217;s principle. I felt a growing uneasiness and realized that I didn&#8217;t understand AD in the way I like to understand software, namely,</p
>

<ul
><li
  ><em
    >What</em
    > does it mean, independently of implementation?</li
  ><li
  ><em
    >How</em
    > do the implementation and its correctness flow gracefully from that meaning?</li
  ><li
  ><em
    >Where</em
    > else might we go, guided by answers to the first two questions?</li
  ></ul
>

<p
>Ever since writing <em
  ><a href="http://conal.net/papers/simply-reactive" title="paper"
    >Simply efficient functional reactivity</a
    ></em
  >, the idea of <a href="http://conal.net/blog/tag/type-class-morphism/" title="posts on type class morphisms"
  >type class morphisms</a
  > keeps popping up for me as a framework in which to ask and answer these questions. To my delight, this framework gives me new and more satisfying insight into automatic differentiation.</p
>

<p><span id="more-79"></span></p>

<div id="whats-a-derivative"
><h3
  >What&#8217;s a derivative?</h3
  ><p
  >My first guess is that AD has something to do with derivatives, which then raises the question of what is a derivative. For now, I&#8217;m going to substitute a popular but problematic answer to that question and say that</p
  ><pre class="sourceCode haskell"
  ><code
    >deriv <span class="dv"
      >&#8759;</span
      > &#8943; &#8658; (a &#8594; b) &#8594; (a &#8594; b) <span class="co"
      >--  simplification</span
      ><br
       /></code
    ></pre
  ><p
  >As discussed in <em
    ><a href="http://conal.net/blog/posts/what-is-a-derivative-really/" title="blog post"
      >What is a derivative, really?</a
      ></em
    >, the popular answer has limited usefulness, applying just to scalar (one-dimensional) domain. The real deal involves distinguishing the type <code
    >b</code
    > from the type <code
    >a :-* b</code
    > of <a href="http://conal.net/blog/tag/linear-map/" title="posts on linear maps"
    >linear maps</a
    > from <code
    >a</code
    > to <code
    >b</code
    >.</p
  ><pre class="sourceCode haskell"
  ><code
    >deriv <span class="dv"
      >&#8759;</span
      > (<span class="dt"
      >VectorSpace</span
      > u, <span class="dt"
      >VectorSpace</span
      > v) &#8658; (u &#8594; v) &#8594; (u &#8594; (u <span class="fu"
      >:-*</span
      > v))<br
       /></code
    ></pre
  ><div id="why-care-about-derivatives"
  ><h4
    >Why care about derivatives?</h4
    ><p
    >Derivatives are useful in a variety of application areas, including root-finding, optimization, curve and surface tessellation, and computation of surface normals for 3D rendering. Considering the usefulness of derivatives, it is worthwhile to find software methods that are</p
    ><ul
    ><li
      >simple (to implement and verify),</li
      ><li
      >convenient,</li
      ><li
      >accurate,</li
      ><li
      >efficient, and</li
      ><li
      >general.</li
      ></ul
    ></div
  ></div
>

<div id="what-isnt-ad"
><h3
  >What <em
    >isn't</em
    > AD?</h3
  ><div id="numeric-approximation"
  ><h4
    >Numeric approximation</h4
    ><p
    >One differentiation method is <em
      >numeric approximation</em
      >, using simple finite differences. This method is based on the definition of (scalar) derivative:</p
    ><div class=math-inset>
<p
    ><span class="math"
      ><em
    >d</em
    ><em
    >e</em
    ><em
    >r</em
    ><em
    >i</em
    ><em
    >v</em
    > <em
    >f</em
    > <em
    >x</em
    > ≡ lim<sub
    ><em
      >h</em
      > → 0</sub
    >(<em
    >f</em
    > (<em
    >x</em
    > + <em
    >h</em
    >) - <em
    >f</em
    > <em
    >x</em
    >) / <em
    >h</em
    ></span
      ></p
    ></div>
<p
    >The left-hand side reads &quot;the derivative of <em
      >f</em
      > at <em
      >x</em
      >&quot;.</p
    ><p
    >To approximate the derivative, use</p
    ><div class=math-inset>
<p
    ><span class="math"
      ><em
    >d</em
    ><em
    >e</em
    ><em
    >r</em
    ><em
    >i</em
    ><em
    >v</em
    > <em
    >f</em
    > <em
    >x</em
    > ≈ (<em
    >f</em
    > (<em
    >x</em
    > + <em
    >h</em
    >) - <em
    >f</em
    > <em
    >x</em
    >) / <em
    >h</em
    ></span
      ></p
    ></div>
<p
    >for a small value of <em
      >h</em
      >. While very simple, this method is often inaccurate, due to choosing either too large or too small a value for <em
      >h</em
      >. (Small values of <em
      >h</em
      > lead to rounding errors.) More sophisticated variations improve accuracy while sacrificing simplicity.</p
    ></div
  ><div id="symbolic-differentiation"
  ><h4
    >Symbolic differentiation</h4
    ><p
    >A second method is <em
      >symbolic differentiation</em
      >. Instead of using the definition of <em
      >deriv</em
      > directly, the symbolic method uses a collection of rules, such as those below:</p
    ><pre class="sourceCode haskell"
    ><code
      >deriv (u <span class="fu"
    >+</span
    > v)   &#8801; deriv u <span class="fu"
    >+</span
    > deriv v<br
     />deriv (u <span class="fu"
    >*</span
    > v)   &#8801; deriv v <span class="fu"
    >*</span
    > u <span class="fu"
    >+</span
    > deriv u <span class="fu"
    >*</span
    > v<br
     />deriv (<span class="fu"
    >-</span
    > u)     &#8801; <span class="fu"
    >-</span
    > deriv u<br
     />deriv (<span class="fu"
    >exp</span
    > u)   &#8801; deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >exp</span
    > u<br
     />deriv (<span class="fu"
    >log</span
    > u)   &#8801; deriv u <span class="fu"
    >/</span
    > u<br
     />deriv (<span class="fu"
    >sqrt</span
    > u)  &#8801; deriv u <span class="fu"
    >/</span
    > (<span class="dv"
    >2</span
    > <span class="fu"
    >*</span
    > <span class="fu"
    >sqrt</span
    > u)<br
     />deriv (<span class="fu"
    >sin</span
    > u)   &#8801; deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > u<br
     />deriv (<span class="fu"
    >cos</span
    > u)   &#8801; deriv u <span class="fu"
    >*</span
    > (<span class="fu"
    >-</span
    > <span class="fu"
    >sin</span
    > u)<br
     />deriv (<span class="fu"
    >asin</span
    > u)  &#8801; deriv u<span class="fu"
    >/</span
    >(<span class="fu"
    >sqrt</span
    > (<span class="dv"
    >1</span
    > <span class="fu"
    >-</span
    > u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    >))<br
     />deriv (<span class="fu"
    >acos</span
    > u)  &#8801; <span class="fu"
    >-</span
    > deriv u<span class="fu"
    >/</span
    >(<span class="fu"
    >sqrt</span
    > (<span class="dv"
    >1</span
    > <span class="fu"
    >-</span
    > u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    >))<br
     />deriv (<span class="fu"
    >atan</span
    > u)  &#8801; deriv u <span class="fu"
    >/</span
    > (u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    > <span class="fu"
    >+</span
    > <span class="dv"
    >1</span
    >)<br
     />deriv (<span class="fu"
    >sinh</span
    > u)  &#8801; deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >cosh</span
    > u<br
     />deriv (<span class="fu"
    >cosh</span
    > u)  &#8801; deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >sinh</span
    > u<br
     />deriv (<span class="fu"
    >asinh</span
    > u) &#8801; deriv u <span class="fu"
    >/</span
    > (<span class="fu"
    >sqrt</span
    > (u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    > <span class="fu"
    >+</span
    > <span class="dv"
    >1</span
    >))<br
     />deriv (<span class="fu"
    >acosh</span
    > u) &#8801; <span class="fu"
    >-</span
    > deriv u <span class="fu"
    >/</span
    > (<span class="fu"
    >sqrt</span
    > (u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    > <span class="fu"
    >-</span
    > <span class="dv"
    >1</span
    >))<br
     />deriv (<span class="fu"
    >atanh</span
    > u) &#8801; deriv u <span class="fu"
    >/</span
    > (<span class="dv"
    >1</span
    > <span class="fu"
    >-</span
    > u<span class="fu"
    >^</span
    ><span class="dv"
    >2</span
    >)<br
     /></code
      ></pre
    ><p
    >There are two main drawbacks to the symbolic approach to differentiation.</p
    ><ul
    ><li
      >As a symbolic method, it requires access to and transformation of source code, and placing restrictions on that source code.</li
      ><li
      >Implementations tend to be quite expensive and in particular perform redundant computation. (I wonder if this latter criticism is a straw man argument. Are symbolic methods <em
    >necessarily</em
    > expensive or just when implemented naïvely? For instance, can simply memoized symbolic differentiation be nearly as cheap as AD?)</li
      ></ul
    ></div
  ></div
>

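<p>The finite-difference method just described is easy to sketch and to check against the exact derivative. The following is my own illustration (not from the post), using the same <code>f1 z = sqrt (3 * sin z)</code> example that appears later; the central difference is a common refinement that improves on the forward difference&#8217;s accuracy.</p>

```haskell
-- Sketch: forward and central finite differences, compared with the
-- exact derivative of f1 x = sqrt (3 * sin x), namely
-- 3 * cos x / (2 * sqrt (3 * sin x)).
forwardDiff, centralDiff :: Double -> (Double -> Double) -> Double -> Double
forwardDiff h f x = (f (x + h) - f x) / h
centralDiff h f x = (f (x + h) - f (x - h)) / (2 * h)

f1 :: Double -> Double
f1 z = sqrt (3 * sin z)

exactDeriv :: Double -> Double
exactDeriv x = 3 * cos x / (2 * sqrt (3 * sin x))

main :: IO ()
main = do
  print (forwardDiff 1e-5 f1 2)  -- close to the exact value
  print (centralDiff 1e-5 f1 2)  -- closer still
  print (exactDeriv 2)           -- about -0.3779, matching the AD run later
```

Choosing <code>h</code> is the delicate part: too large and the truncation error dominates; too small and floating-point rounding does, as the post notes.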
<div id="what-is-ad-and-how-does-it-work"
><h3
  >What is AD and how does it work?</h3
  ><p
  >A third method is the topic of this post, namely <em
    >automatic differentiation</em
    > (also called &quot;algorithmic differentiation&quot;), or &quot;AD&quot;. The idea of AD is to simultaneously manipulate values and derivatives. Overloading of the standard numerical operations (and literals) makes this combined manipulation as convenient and elegant as manipulating values without derivatives.</p
  ><p
  >The implementation of AD can be quite simple, as shown below:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > <span class="dt"
      >D</span
      > a <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > a a <span class="kw"
      >deriving</span
      > (<span class="kw"
      >Eq</span
      >,<span class="kw"
      >Show</span
      >)<br
       /><br
       /><span class="kw"
      >instance</span
      > <span class="kw"
      >Num</span
      > a &#8658; <span class="kw"
      >Num</span
      > (<span class="dt"
      >D</span
      > a) <span class="kw"
      >where</span
      ><br
       />  <span class="dt"
      >D</span
      > x x' <span class="fu"
      >+</span
      > <span class="dt"
      >D</span
      > y y' <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (x<span class="fu"
      >+</span
      >y) (x'<span class="fu"
      >+</span
      >y')<br
       />  <span class="dt"
      >D</span
      > x x' <span class="fu"
      >*</span
      > <span class="dt"
      >D</span
      > y y' <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (x<span class="fu"
      >*</span
      >y) (y'<span class="fu"
      >*</span
      >x <span class="fu"
      >+</span
      > x'<span class="fu"
      >*</span
      >y)<br
       />  <span class="fu"
      >fromInteger</span
      > x   <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >fromInteger</span
      > x) <span class="dv"
      >0</span
      ><br
       />  <span class="fu"
      >negate</span
      > (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >negate</span
      > x) (<span class="fu"
      >negate</span
      > x')<br
       />  <span class="fu"
      >signum</span
      > (<span class="dt"
      >D</span
      > x _ ) <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >signum</span
      > x) <span class="dv"
      >0</span
      ><br
       />  <span class="fu"
      >abs</span
      >    (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >abs</span
      > x) (x' <span class="fu"
      >*</span
      > <span class="fu"
      >signum</span
      > x)<br
       /><br
       /><span class="kw"
      >instance</span
      > <span class="kw"
      >Fractional</span
      > x &#8658; <span class="kw"
      >Fractional</span
      > (<span class="dt"
      >D</span
      > x) <span class="kw"
      >where</span
      ><br
       />  <span class="fu"
      >fromRational</span
      > x  <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >fromRational</span
      > x) <span class="dv"
      >0</span
      ><br
       />  <span class="fu"
      >recip</span
      >  (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >recip</span
      > x) (- x' <span class="fu"
      >/</span
      > sqr x)<br
       /><br
       />sqr <span class="dv"
      >&#8759;</span
      > <span class="kw"
      >Num</span
      > a &#8658; a &#8594; a<br
       />sqr x <span class="fu"
      >=</span
      > x <span class="fu"
      >*</span
      > x<br
       /><br
       /><span class="kw"
      >instance</span
      > <span class="kw"
      >Floating</span
      > x &#8658; <span class="kw"
      >Floating</span
      > (<span class="dt"
      >D</span
      > x) <span class="kw"
      >where</span
      ><br
       />  &#960;              <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > &#960; <span class="dv"
      >0</span
      ><br
       />  <span class="fu"
      >exp</span
      >    (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >exp</span
      >    x) (x' <span class="fu"
      >*</span
      > <span class="fu"
      >exp</span
      > x)<br
       />  <span class="fu"
      >log</span
      >    (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >log</span
      >    x) (x' <span class="fu"
      >/</span
      > x)<br
       />  <span class="fu"
      >sqrt</span
      >   (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >sqrt</span
      >   x) (x' <span class="fu"
      >/</span
      > (<span class="dv"
      >2</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >sqrt</span
      > x))<br
       />  <span class="fu"
      >sin</span
      >    (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >sin</span
      >    x) (x' <span class="fu"
      >*</span
      > <span class="fu"
      >cos</span
      > x)<br
       />  <span class="fu"
      >cos</span
      >    (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >cos</span
      >    x) (x' <span class="fu"
      >*</span
      > (<span class="fu"
      >-</span
      > <span class="fu"
      >sin</span
      > x))<br
       />  <span class="fu"
      >asin</span
      >   (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >asin</span
      >   x) (x' <span class="fu"
      >/</span
      > <span class="fu"
      >sqrt</span
      > (<span class="dv"
      >1</span
      > <span class="fu"
      >-</span
      > sqr x))<br
       />  <span class="fu"
      >acos</span
      >   (<span class="dt"
      >D</span
      > x x') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (<span class="fu"
      >acos</span
      >   x) (x' <span class="fu"
      >/</span
      > (<span class="fu"
      >-</span
      >  <span class="fu"
      >sqrt</span
      > (<span class="dv"
      >1</span
      > <span class="fu"
      >-</span
      > sqr x)))<br
       />  <span class="co"
      >-- &#8943;</span
      ><br
       /></code
    ></pre
  ><p
  >As an example, define</p
  ><pre class="sourceCode haskell"
  ><code
    >f1 <span class="dv"
      >&#8759;</span
      > <span class="kw"
      >Floating</span
      > a &#8658; a &#8594; a<br
       />f1 z <span class="fu"
      >=</span
      > <span class="fu"
      >sqrt</span
      > (<span class="dv"
      >3</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >sin</span
      > z)<br
       /></code
    ></pre
  ><p
  >and try it out in GHCi:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="fu"
      >*</span
      ><span class="dt"
      >Main</span
      ><span class="fu"
      >&gt;</span
      > f1 (<span class="dt"
      >D</span
      > <span class="dv"
      >2</span
      > <span class="dv"
      >1</span
      >)<br
       /><span class="dt"
      >D</span
      > <span class="dv"
      >1</span
      ><span class="fu"
      >.</span
      ><span class="dv"
      >6516332160855343</span
      > (<span class="fu"
      >-</span
      ><span class="dv"
      >0</span
      ><span class="fu"
      >.</span
      ><span class="dv"
      >3779412091869595</span
      >)<br
       /></code
    ></pre
  ><p
  >To test correctness, here is a symbolically differentiated version:</p
  ><pre class="sourceCode haskell"
  ><code
    >f2 <span class="dv"
      >&#8759;</span
      > <span class="kw"
      >Floating</span
      > a &#8658; a &#8594; <span class="dt"
      >D</span
      > a<br
       />f2 x <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (f1 x) (<span class="dv"
      >3</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >cos</span
      > x <span class="fu"
      >/</span
      > (<span class="dv"
      >2</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >sqrt</span
      > (<span class="dv"
      >3</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >sin</span
      > x)))<br
       /></code
    ></pre
  ><p
  >Try it out:</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="fu"
      >*</span
      ><span class="dt"
      >Main</span
      ><span class="fu"
      >&gt;</span
      > f2 <span class="dv"
      >2</span
      ><br
       /><span class="dt"
      >D</span
      > <span class="dv"
      >1</span
      ><span class="fu"
      >.</span
      ><span class="dv"
      >6516332160855343</span
      > (<span class="fu"
      >-</span
      ><span class="dv"
      >0</span
      ><span class="fu"
      >.</span
      ><span class="dv"
      >3779412091869595</span
      >)<br
       /></code
    ></pre
  ><p
>The code can also be made prettier, as in <em
    ><a href="http://conal.net/blog/posts/beautiful-differentiation/" title="blog post"
      >Beautiful differentiation</a
      ></em
    >. Add an operator that captures the chain rule, which is behind the differentiation laws listed above.</p
  ><pre class="sourceCode haskell"
  ><code
    >infix  <span class="dv"
      >0</span
      > <span class="fu"
      >&gt;-&lt;</span
      ><br
       />(<span class="fu"
      >&gt;-&lt;</span
      >) <span class="dv"
      >&#8759;</span
      > <span class="kw"
      >Num</span
      > a &#8658; (a &#8594; a) &#8594; (a &#8594; a) &#8594; (<span class="dt"
      >D</span
      > a &#8594; <span class="dt"
      >D</span
      > a)<br
       />(f <span class="fu"
      >&gt;-&lt;</span
      > f') (<span class="dt"
      >D</span
      > a a') <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (f a) (a' <span class="fu"
      >*</span
      > f' a)<br
       /></code
    ></pre
  ><p
  >Then, e.g.,</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >instance</span
      > <span class="kw"
      >Floating</span
      > a &#8658; <span class="kw"
      >Floating</span
      > (<span class="dt"
      >D</span
      > a) <span class="kw"
      >where</span
      ><br
       />  &#960;   <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > &#960; <span class="dv"
      >0</span
      ><br
       />  <span class="fu"
      >exp</span
      >  <span class="fu"
      >=</span
      > <span class="fu"
      >exp</span
      >  <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >exp</span
      ><br
       />  <span class="fu"
      >log</span
      >  <span class="fu"
      >=</span
      > <span class="fu"
      >log</span
      >  <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >recip</span
      ><br
       />  <span class="fu"
      >sqrt</span
      > <span class="fu"
      >=</span
      > <span class="fu"
      >sqrt</span
      > <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >recip</span
      > (<span class="dv"
      >2</span
      > <span class="fu"
      >*</span
      > <span class="fu"
      >sqrt</span
      >)<br
       />  <span class="fu"
      >sin</span
      >  <span class="fu"
      >=</span
      > <span class="fu"
      >sin</span
      >  <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >cos</span
      ><br
       />  <span class="fu"
      >cos</span
      >  <span class="fu"
      >=</span
      > <span class="fu"
      >cos</span
      >  <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >-</span
      > <span class="fu"
      >sin</span
      ><br
       />  <span class="fu"
      >asin</span
      > <span class="fu"
      >=</span
      > <span class="fu"
      >asin</span
      > <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >recip</span
      > (<span class="fu"
      >sqrt</span
      > (<span class="dv"
      >1</span
      ><span class="fu"
      >-</span
      >sqr))<br
       />  <span class="fu"
      >acos</span
      > <span class="fu"
      >=</span
      > <span class="fu"
      >acos</span
      > <span class="fu"
      >&gt;-&lt;</span
      > <span class="fu"
      >recip</span
      > (<span class="fu"
      >-</span
      > <span class="fu"
      >sqrt</span
      > (<span class="dv"
      >1</span
      ><span class="fu"
      >-</span
      >sqr))<br
       />  <span class="co"
      >-- &#8943;</span
      ><br
       /></code
    ></pre
  ><p
>This AD implementation satisfies most of our criteria very well:</p
  ><ul
  ><li
    >It is simple to implement and verify. Both the implementation and its correctness follow directly from the familiar laws given above.</li
    ><li
    >It is convenient to use, as shown with <code
      >f1</code
      > above.</li
    ><li
    >It is accurate, as shown above, producing <em
      >exactly</em
> the same result as the symbolically differentiated code (<code
      >f2</code
      >).</li
    ><li
    >It is efficient, involving no iteration or redundant computation.</li
    ></ul
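><p
>To make the pieces concrete, here is a minimal consolidated <code
>Num</code
> instance, my own sketch, assembled only from definitions appearing in this post:</p
><pre class="sourceCode haskell"
><code
>data D a = D a a deriving Show

instance Num a &#8658; Num (D a) where
  fromInteger n   = D (fromInteger n) 0
  D a a' + D b b' = D (a + b) (a' + b')
  D a a' * D b b' = D (a * b) (a' * b + b' * a)
  negate (D a a') = D (negate a) (negate a')
  -- abs and signum omitted in this sketch
</code
></pre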
  ><p
  >The formulation above does less well with <em
    >generality</em
    >:</p
  ><ul
  ><li
    >It computes only first derivatives.</li
    ><li
    >It applies (correctly) only to functions over a scalar (one-dimensional) domain, excluding even complex numbers.</li
    ></ul
  ><p
  >Both of these limitations are removed in the post <em
    ><a href="http://conal.net/blog/posts/higher-dimensional-higher-order-derivatives-functionally/" title="blog post"
      >Higher-dimensional, higher-order derivatives, functionally</a
      ></em
    >.</p
  ></div
>

<div id="what-is-ad-really"
><h3
  >What is AD, really?</h3
  ><p
  >How do we know whether this AD implementation is correct? We can't begin to address this question until we first answer a more fundamental one: what does its correctness mean?</p
  ><div id="a-model-for-ad"
  ><h4
    >A model for AD</h4
    ><p
    >I'm pretty sure AD has something to do with calculating a function's values and derivative values simultaneously, so I'll start there.</p
    ><pre class="sourceCode haskell"
    ><code
      >withD <span class="dv"
    >&#8759;</span
    > &#8943; &#8658; (a &#8594; a) &#8594; (a &#8594; <span class="dt"
    >D</span
    > a)<br
     />withD f x <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (f x) (deriv f x)<br
     /></code
      ></pre
    ><p
    >Or, in point-free form,</p
    ><pre class="sourceCode haskell"
    ><code
      >withD f <span class="fu"
    >=</span
    > liftA2 <span class="dt"
    >D</span
    > f (deriv f)<br
     /></code
      ></pre
    ><p
>This point-free form works because, on functions,</p
    ><pre class="sourceCode haskell"
    ><code
      >liftA2 h f g <span class="fu"
    >=</span
    > &#955; x &#8594; h (f x) (g x)<br
     /></code
      ></pre
    ><p
    >We don't have an implementation of <code
      >deriv</code
      >, so this definition of <code
      >withD</code
      > will serve as a specification, not an implementation.</p
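><p
>Although <code
>deriv</code
> has no implementation, a crude numeric stand-in can still help spot-check instances against this specification. The following helper is hypothetical (mine, not from the post), using a central finite difference with step <code
>h</code
>:</p
><pre class="sourceCode haskell"
><code
>-- Hypothetical: approximate deriv, for sanity checks only.
derivApprox &#8759; Fractional a &#8658; a &#8594; (a &#8594; a) &#8594; (a &#8594; a)
derivApprox h f x = (f (x + h) - f (x - h)) / (2 * h)
</code
></pre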
    ><p
    >If AD is structured as type class instances, then I'd want there to be a compelling interpretation function that is faithful to each of those classes, as in the principle of <a href="http://conal.net/blog/tag/type-class-morphism/" title="posts on type class morphisms"
      >type class morphisms</a
      >, which is to say that the interpretation of each method corresponds to the same method for the interpretation.</p
    ><p
    >For AD, the interpretation function is <code
      >withD</code
      >. It's turned around this time (mapping <em
      >to</em
      > instead of <em
      >from</em
      > our type), as is sometimes the case. The <code
      >Num</code
      >, <code
      >Fractional</code
      >, and <code
      >Floating</code
      > morphisms provide the specifications of the instances:</p
    ><pre class="sourceCode haskell"
    ><code
      >withD (u <span class="fu"
    >+</span
    > v) &#8801; withD u <span class="fu"
    >+</span
    > withD v<br
     />withD (u <span class="fu"
    >*</span
    > v) &#8801; withD u <span class="fu"
    >*</span
    > withD v<br
     />withD (<span class="fu"
    >sin</span
    > u) &#8801; <span class="fu"
    >sin</span
    > (withD u)<br
     />&#8943;<br
     /></code
      ></pre
    ><p
    >Note here that the methods on the left are on <code
      >a &#8594; a</code
      >, and on the right are on <code
      >a &#8594; D a</code
      >.</p
    ><p
    >These (morphism) properties exactly define correctness of any implementation of AD, answering my first question:</p
    ><blockquote
    ><p
      ><em
    >What</em
    > does it mean, independently of implementation?</p
      ></blockquote
    ></div
  ></div
>

<div id="deriving-an-ad-implementation"
><h3
  >Deriving an AD implementation</h3
  ><p
  >Now that we have a simple, formal specification of AD (numeric type class morphisms), we can try to prove that the implementation above satisfies the specification. Better yet, let's do the reverse, and use the morphism properties to <em
    >discover</em
    > the implementation, and prove it correct in the process.</p
  ><div id="addition"
  ><h4
    >Addition</h4
    ><p
    >Here is the addition specification:</p
    ><pre class="sourceCode haskell"
    ><code
      >withD (u <span class="fu"
    >+</span
    > v) &#8801; withD u <span class="fu"
    >+</span
    > withD v<br
     /></code
      ></pre
    ><p
    >Start with the left-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (u <span class="fu"
    >+</span
    > v)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >+</span
    > v) (deriv (u <span class="fu"
    >+</span
    > v))<br
     />&#8801;   <span class="co"
    >{- deriv rule for (+) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >+</span
    > v) (deriv u <span class="fu"
    >+</span
    > deriv v)<br
     />&#8801;   <span class="co"
    >{- liftA2 on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > ((u <span class="fu"
    >+</span
    > v) x) ((deriv u <span class="fu"
    >+</span
    > deriv v) x)<br
     />&#8801;   <span class="co"
    >{- (+) on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >+</span
    > v x) (deriv u x <span class="fu"
    >+</span
    > deriv v x)<br
     /></code
      ></pre
    ><p
    >Then start over with the right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD u <span class="fu"
    >+</span
    > withD v<br
     />&#8801;   <span class="co"
    >{- (+) on functions -}</span
    ><br
     />   &#955; x &#8594; withD u x <span class="fu"
    >+</span
    > withD v x<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x) (deriv u x) <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
    > (v x) (deriv v x)<br
     /></code
      ></pre
    ><p
    >We need a definition of <code
      >(+)</code
      > on <code
      >D</code
      > that makes these two final forms equal, i.e.,</p
    ><pre class="sourceCode haskell"
    ><code
      >   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >+</span
    > v x) (deriv u x <span class="fu"
    >+</span
    > deriv v x)<br
     />&#8801;<br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x) (deriv u x) <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
    > (v x) (deriv v x)<br
     /></code
      ></pre
    ><p
    >An easy choice is</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="dt"
    >D</span
    > a a' <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
    > b b' <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (a <span class="fu"
    >+</span
    > b) (a' <span class="fu"
    >+</span
    > b')<br
     /></code
      ></pre
    ><p
>This definition provides the missing link, completing the proof that</p
    ><pre class="sourceCode haskell"
    ><code
      >withD (u <span class="fu"
    >+</span
    > v) &#8801; withD u <span class="fu"
    >+</span
    > withD v<br
     /></code
      ></pre
    ></div
  ><div id="multiplication"
  ><h4
    >Multiplication</h4
    ><p
    >The specification:</p
    ><pre class="sourceCode haskell"
    ><code
      >withD (u <span class="fu"
    >*</span
    > v) &#8801; withD u <span class="fu"
    >*</span
    > withD v<br
     /></code
      ></pre
    ><p
    >Reason similarly to the addition case. Begin with the left hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (u <span class="fu"
    >*</span
    > v)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (deriv (u <span class="fu"
    >*</span
    > v))<br
     />&#8801;   <span class="co"
    >{- deriv rule for (*) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (deriv u <span class="fu"
    >*</span
    > v <span class="fu"
    >+</span
    > deriv v <span class="fu"
    >*</span
    > u)<br
     />&#8801;   <span class="co"
    >{- liftA2 on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > ((u <span class="fu"
    >*</span
    > v) x) ((deriv u <span class="fu"
    >*</span
    > v <span class="fu"
    >+</span
    > deriv v <span class="fu"
    >*</span
    > u) x)<br
     />&#8801;   <span class="co"
    >{- (*) and (+) on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >*</span
    > v x) (deriv u x <span class="fu"
    >*</span
    > v x <span class="fu"
    >+</span
    > deriv v x <span class="fu"
    >*</span
    > u x)<br
     /></code
      ></pre
    ><p
    >Then start over with the right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD u <span class="fu"
    >*</span
    > withD v<br
     />&#8801;   <span class="co"
    >{- (*) on functions -}</span
    ><br
     />   &#955; x &#8594; withD u x <span class="fu"
    >*</span
    > withD v x<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x) (deriv u x) <span class="fu"
    >*</span
    > <span class="dt"
    >D</span
    > (v x) (deriv v x)<br
     /></code
      ></pre
    ><p
    >Sufficient definition:</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="dt"
    >D</span
    > a a' <span class="fu"
    >*</span
    > <span class="dt"
    >D</span
    > b b' <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (a <span class="fu"
>*</span
    > b) (a' <span class="fu"
    >*</span
    > b <span class="fu"
    >+</span
    > b' <span class="fu"
    >*</span
    > a)<br
     /></code
      ></pre
    ></div
  ><div id="sine"
  ><h4
    >Sine</h4
    ><p
    >Specification:</p
    ><pre class="sourceCode haskell"
    ><code
      >withD (<span class="fu"
    >sin</span
    > u) &#8801; <span class="fu"
    >sin</span
    > (withD u)<br
     /></code
      ></pre
    ><p
    >Begin with the left hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (<span class="fu"
    >sin</span
    > u)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (deriv (<span class="fu"
    >sin</span
    > u))<br
     />&#8801;   <span class="co"
    >{- deriv rule for sin -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > u)<br
     />&#8801;   <span class="co"
    >{- liftA2 on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > ((<span class="fu"
    >sin</span
    > u) x) ((deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > u) x)<br
     />&#8801;   <span class="co"
    >{- sin, (*) and cos on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > (u x)) (deriv u x <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > (u x))<br
     /></code
      ></pre
    ><p
    >Then start over with the right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   <span class="fu"
    >sin</span
    > (withD u)<br
     />&#8801;   <span class="co"
    >{- sin on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="fu"
    >sin</span
    > (withD u x)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   &#955; x &#8594; <span class="fu"
    >sin</span
    > (<span class="dt"
    >D</span
    > (u x) (deriv u x))<br
     /></code
      ></pre
    ><p
    >Sufficient definition:</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="fu"
    >sin</span
    > (<span class="dt"
    >D</span
    > a a') <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > a) (a' <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > a)<br
     /></code
      ></pre
    ><p
    >Or, using the chain rule operator,</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="fu"
    >sin</span
    > <span class="fu"
    >=</span
    > <span class="fu"
    >sin</span
    > <span class="fu"
    >&gt;-&lt;</span
    > <span class="fu"
    >cos</span
    ><br
     /></code
      ></pre
    ><p
    >The whole implementation can be derived in exactly this style, answering my second question:</p
    ><blockquote
    ><p
      ><em
    >How</em
> do the implementation and its correctness flow gracefully from that meaning?</p
      ></blockquote
    ></div
  ></div
>

<div id="higher-order-derivatives"
><h3
  >Higher-order derivatives</h3
  ><p
>Given answers to the first two questions, let's turn to the third:</p
  ><blockquote
  ><p
    ><em
      >Where</em
      > else might we go, guided by answers to the first two questions?</p
    ></blockquote
  ><p
  >Jerzy Karczmarczuk extended the <code
    >D</code
    > representation above to an infinite &quot;lazy tower of derivatives&quot;, in the paper <em
    ><a href="http://citeseer.ist.psu.edu/karczmarczuk98functional.html" title="ICFP '98 paper by Jerzy Karczmarczuk"
      >Functional Differentiation of Computer Programs</a
      ></em
    >.</p
  ><pre class="sourceCode haskell"
  ><code
    ><span class="kw"
      >data</span
      > <span class="dt"
      >D</span
      > a <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > a (<span class="dt"
      >D</span
      > a)<br
       /></code
    ></pre
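><p
>For intuition, such a tower can be flattened into the infinite list of successive derivative values. The helper below is my own addition, not from the paper or post:</p
><pre class="sourceCode haskell"
><code
>-- Hypothetical: all derivatives in the tower, outermost first.
derivs &#8759; D a &#8594; [a]
derivs (D a d) = a : derivs d
</code
></pre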
  ><p
  >The <code
    >withD</code
    > function easily adapts to this new <code
    >D</code
    > type:</p
  ><pre class="sourceCode haskell"
  ><code
    >withD <span class="dv"
      >&#8759;</span
      > &#8943; &#8658; (a &#8594; a) &#8594; (a &#8594; <span class="dt"
      >D</span
      > a)<br
       />withD f x <span class="fu"
      >=</span
      > <span class="dt"
      >D</span
      > (f x) (withD (deriv f) x)<br
       /></code
    ></pre
  ><p
  >or</p
  ><pre class="sourceCode haskell"
  ><code
    >withD f <span class="fu"
      >=</span
      > liftA2 <span class="dt"
      >D</span
      > f (withD (deriv f))<br
       /></code
    ></pre
  ><p
  >These definitions were not brilliant insights. I looked for the simplest, type-correct possibility (without using ⊥).</p
  ><p
  >Similarly, I'll try tweaking the previous derivations and see what pops out.</p
  ><div id="addition-1"
  ><h4
    >Addition</h4
    ><p
    >Left-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (u <span class="fu"
    >+</span
    > v)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >+</span
    > v) (withD (deriv (u <span class="fu"
    >+</span
    > v)))<br
     />&#8801;   <span class="co"
    >{- deriv rule for (+) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >+</span
    > v) (withD (deriv u <span class="fu"
    >+</span
    > deriv v))<br
     />&#8801;   <span class="co"
    >{- (fixed-point) induction withD and (+) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >+</span
    > v) (withD (deriv u) <span class="fu"
    >+</span
    > withD (deriv v))<br
     />&#8801;   <span class="co"
    >{- def of liftA2 and (+) on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >+</span
    > v x) (withD (deriv u) x <span class="fu"
    >+</span
    > withD (deriv v) x)<br
     /></code
      ></pre
    ><p
    >Right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD u <span class="fu"
    >+</span
    > withD v<br
     />&#8801;   <span class="co"
    >{- (+) on functions -}</span
    ><br
     />   &#955; x &#8594; withD u x <span class="fu"
    >+</span
    > withD v x<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
> (u x) (withD (deriv u) x) <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
> (v x) (withD (deriv v) x)<br
     /></code
      ></pre
    ><p
    >Again, we need a definition of <code
      >(+)</code
      > on <code
      >D</code
      > that makes the LHS and RHS final forms equal, i.e.,</p
    ><pre class="sourceCode haskell"
    ><code
      >   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >+</span
    > v x) (withD (deriv u) x <span class="fu"
    >+</span
> withD (deriv v) x)<br
     />&#8801;<br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x) (withD (deriv u) x) <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
    > (v x) (withD (deriv v) x)<br
     /></code
      ></pre
    ><p
    >Again, an easy choice is</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="dt"
    >D</span
    > a a' <span class="fu"
    >+</span
    > <span class="dt"
    >D</span
    > b b' <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (a <span class="fu"
    >+</span
    > b) (a' <span class="fu"
    >+</span
    > b')<br
     /></code
      ></pre
    ></div
  ><div id="multiplication-1"
  ><h4
    >Multiplication</h4
    ><p
    >Left-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (u <span class="fu"
    >*</span
    > v)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (withD (deriv (u <span class="fu"
    >*</span
    > v)))<br
     />&#8801;   <span class="co"
    >{- deriv rule for (*) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (withD (deriv u <span class="fu"
    >*</span
    > v <span class="fu"
    >+</span
    > deriv v <span class="fu"
    >*</span
    > u))<br
     />&#8801;   <span class="co"
    >{- induction for withD/(+) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (withD (deriv u <span class="fu"
    >*</span
    > v) <span class="fu"
    >+</span
    > withD (deriv v <span class="fu"
    >*</span
    > u))<br
     />&#8801;   <span class="co"
    >{- induction for withD/(*) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (u <span class="fu"
    >*</span
    > v) (withD (deriv u) <span class="fu"
    >*</span
    > withD v <span class="fu"
    >+</span
    > withD (deriv v) <span class="fu"
    >*</span
    > withD u)<br
     />&#8801;   <span class="co"
    >{- liftA2, (*), (+) on functions -}</span
    ><br
 />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x <span class="fu"
    >*</span
    > v x) (withD (deriv u) x <span class="fu"
    >*</span
    > withD v x <span class="fu"
    >+</span
    > withD (deriv v) x <span class="fu"
    >*</span
    > withD u x)<br
     /></code
      ></pre
    ><p
    >Right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD u <span class="fu"
    >*</span
    > withD v<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > u (withD (deriv u)) <span class="fu"
    >*</span
    > liftA2 <span class="dt"
    >D</span
    > v (withD (deriv v))<br
     />&#8801;   <span class="co"
    >{- liftA2 and (*) on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (u x) (withD (deriv u) x) <span class="fu"
    >*</span
    > <span class="dt"
    >D</span
    > (v x) (withD (deriv v) x)<br
     /></code
      ></pre
    ><p
    >A sufficient definition:</p
    ><pre class="sourceCode haskell"
    ><code
      >a<span class="fu"
    >@</span
    >(<span class="dt"
    >D</span
    > a0 a') <span class="fu"
    >*</span
    > b<span class="fu"
    >@</span
    >(<span class="dt"
    >D</span
    > b0 b') <span class="fu"
    >=</span
    > <span class="dt"
    >D</span
    > (a0 <span class="fu"
>*</span
    > b0) (a' <span class="fu"
    >*</span
    > b <span class="fu"
    >+</span
    > b' <span class="fu"
    >*</span
    > a)<br
     /></code
      ></pre
    ><p
    >Because</p
    ><pre class="sourceCode haskell"
    ><code
      >withD u x &#8801; <span class="dt"
    >D</span
    > (u x) (withD (deriv u) x)<br
     /><br
     />withD v x &#8801; <span class="dt"
    >D</span
    > (v x) (withD (deriv v) x)<br
     /></code
      ></pre
    ></div
  ><div id="sine-1"
  ><h4
    >Sine</h4
    ><p
    >Left-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   withD (<span class="fu"
    >sin</span
    > u)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (withD (deriv (<span class="fu"
    >sin</span
    > u)))<br
     />&#8801;   <span class="co"
    >{- deriv rule for sin -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (withD (deriv u <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > u))<br
     />&#8801;   <span class="co"
    >{- induction for withD/(*) -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (withD (deriv u) <span class="fu"
    >*</span
    > withD (<span class="fu"
    >cos</span
    > u))<br
     />&#8801;   <span class="co"
    >{- induction for withD/cos -}</span
    ><br
     />   liftA2 <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > u) (withD (deriv u) <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > (withD u))<br
     />&#8801;   <span class="co"
    >{- liftA2, sin, cos and (*) on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > (u x)) (withD (deriv u) x <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > (withD u x))<br
     /></code
      ></pre
    ><p
    >Right-hand side:</p
    ><pre class="sourceCode haskell"
    ><code
      >   <span class="fu"
    >sin</span
    > (withD u)<br
     />&#8801;   <span class="co"
    >{- def of withD -}</span
    ><br
     />   <span class="fu"
    >sin</span
    > (liftA2 <span class="dt"
    >D</span
    > u (withD (deriv u)))<br
     />&#8801;   <span class="co"
    >{- liftA2 and sin on functions -}</span
    ><br
     />   &#955; x &#8594; <span class="fu"
    >sin</span
    > (<span class="dt"
    >D</span
    > (u x) (withD (deriv u) x))<br
     /></code
      ></pre
    ><p
    >To make the LHS and RHS final forms equal, define</p
    ><pre class="sourceCode haskell"
    ><code
      ><span class="fu"
    >sin</span
    > a<span class="fu"
    >@</span
    >(<span class="dt"
    >D</span
> a0 a') <span class="fu"
>=</span
> <span class="dt"
    >D</span
    > (<span class="fu"
    >sin</span
    > a0) (a' <span class="fu"
    >*</span
    > <span class="fu"
    >cos</span
    > a)<br
     /></code
      ></pre
    ></div
  ></div
>

<div id="higher-dimensional-derivatives"
><h3
  >Higher-dimensional derivatives</h3
  ><p
  >I'll save non-scalar (&quot;multi-variate&quot;) differentiation for another time. In addition to the considerations above, the key ideas are in <em
    ><a href="http://conal.net/blog/posts/higher-dimensional-higher-order-derivatives-functionally/" title="blog post"
      >Higher-dimensional, higher-order derivatives, functionally</a
      ></em
    > and <em
    ><a href="http://conal.net/blog/posts/simpler-more-efficient-functional-linear-maps/" title="blog post"
      >Simpler, more efficient, functional linear maps</a
      ></em
    >.</p
  ></div
>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/what-is-automatic-differentiation-and-why-does-it-work/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fwhat-is-automatic-differentiation-and-why-does-it-work&amp;language=en_GB&amp;category=text&amp;title=What+is+automatic+differentiation%2C+and+why+does+it+work%3F&amp;description=Bertrand+Russell+remarked+that+Everything+is+vague+to+a+degree+you+do+not+realize+till+you+have+tried+to+make+it+precise.+I%26%238217%3Bm+mulling+over+automatic+differentiation+%28AD%29+again%2C+neatening...&amp;tags=derivative%2Csemantics%2Ctype+class+morphism%2Cblog" type="text/html" />
	</item>
		<item>
		<title>3D rendering as functional reactive programming</title>
		<link>http://conal.net/blog/posts/3d-rendering-as-functional-reactive-programming</link>
		<comments>http://conal.net/blog/posts/3d-rendering-as-functional-reactive-programming#comments</comments>
		<pubDate>Mon, 12 Jan 2009 05:38:58 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[FRP]]></category>
		<category><![CDATA[functional reactive programming]]></category>
		<category><![CDATA[monoid]]></category>
		<category><![CDATA[semantics]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=75</guid>
		<description><![CDATA[I&#8217;ve been playing with a simple/general semantics for 3D. In the process, I was surprised to see that a key part of the semantics looks exactly like a key part of the semantics of functional reactivity as embodied in the library Reactive. A closer look revealed a closer connection still, as described in this post. [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: 3D rendering as functional reactive programming

Tags: 3D, semantics, FRP, functional reactive programming, monoid

URL: http://conal.net/blog/posts/3d-rendering-as-functional-reactive-programming/

-->

<!-- references -->

<!-- teaser -->

<p>I&#8217;ve been playing with a simple/general semantics for 3D.
In the process, I was surprised to see that a key part of the semantics looks exactly like a key part of the semantics of functional reactivity as embodied in the library <em><a href="http://haskell.org/haskellwiki/Reactive" title="Wiki page for the Reactive library">Reactive</a></em>.
A closer look revealed a closer connection still, as described in this post.</p>

<!--
**Edits**:

* 2008-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-75"></span></p>

<h3>What is 3D rendering?</h3>

<p>Most programmers think of 3D rendering as being about executing a sequence of side-effects on a frame buffer or some other mutable array of pixels.
This way of thinking (sequences of side-effects) comes to us from the design of early sequential computers.
Although computer hardware architecture has evolved a great deal, most programming languages, and hence most programming thinking, are still shaped by this first sequential model.
(See John Backus&#8217;s Turing Award lecture <em><a href="http://www.stanford.edu/class/cs242/readings/backus.pdf" title="Turing Award lecture by John Backus">Can Programming Be Liberated from the von Neumann Style?  A functional style and its algebra of programs</a></em>.)
The invention of monadic <em><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.2504" title="paper by Simon Peyton Jones and Philip Wadler">Imperative functional programming</a></em> allows Haskellers to think and program within the imperative paradigm as well.</p>

<p>What&#8217;s a <em>functional</em> alternative?
Rendering is a function from something to something else.
Let&#8217;s call these somethings (3D) &#8220;Geometry&#8221; and (2D) &#8220;Image&#8221;, where <code>Geometry</code> and <code>Image</code> are types of functional (immutable) values.</p>

<pre><code>type Rendering = Image Color

render :: Geometry -&gt; Rendering
</code></pre>

<p>To simplify, I&#8217;m assuming a fixed view.
What remains is to define what these two types <em>mean</em> and, secondarily, how to represent and implement them.</p>

<p>An upcoming post will suggest an answer for the meaning of <code>Geometry</code>.
For now, think of it as a collection of curved and polygonal surfaces, i.e., the <em>outsides</em> (boundaries) of solid shapes.
Each point on these surfaces has a location, a normal (perpendicular direction), and material properties (determining how light is reflected by and transmitted through the surface at the point).
The geometry will contain light sources.</p>

<p>Next, what is the meaning of <code>Image</code>?
A popular answer is that an image is a rectangular array of finite-precision encodings of color (e.g., with eight bits for each of red, blue, green and possibly opacity).
This answer leads to poor compositionality and complex meanings for operations like scaling and rotation, so I prefer another model.
As in <a href="http://conal.net/Pan" title="project web page">Pan</a>, an image (the meaning of the type <code>Image Color</code>) is a function from infinite continuous 2D space to colors, where the <code>Color</code> type includes partial opacity.
For motivation of this model and examples of its use, see <em><a href="http://conal.net/papers/functional-images/" title="book chapter">Functional images</a></em> and the corresponding <a href="http://conal.net/Pan/Gallery" title="gallery of functional images">Pan gallery</a> of functional images.
<em>Composition</em> occurs on infinite &amp; continuous images.</p>

<p>After all composition is done, the resulting image can be sampled into a finite, rectangular array of finite precision color encodings.
I&#8217;m talking about a conceptual/semantic pipeline.
The implementation computes the finite sampling without having to compute the values for the entire infinite image.</p>
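
<p>A minimal sketch of this model in Haskell may make it concrete.
The names <code>Point</code>, <code>Image</code>, and <code>sample</code> here are illustrative only, not Pan&#8217;s actual API:</p>

<pre><code>-- Illustrative semantic model: an image is a function from infinite,
-- continuous 2D space to values.
type Point   = (Double, Double)
type Image a = Point -&gt; a

-- Sample an image on a finite w-by-h grid over the unit square.
-- Laziness means only these w*h sample points are ever computed.
sample :: Int -&gt; Int -&gt; Image a -&gt; [[a]]
sample w h im =
  [ [ im (fromIntegral i / fromIntegral w, fromIntegral j / fromIntegral h)
    | i &lt;- [0 .. w - 1] ]
  | j &lt;- [0 .. h - 1] ]
</code></pre>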

<p>Rendering has several components.
I&#8217;ll just address one and show how it relates to functional reactive programming (FRP).</p>

<h3>Visual occlusion</h3>

<p>One aspect of 3D rendering is <a href="https://en.wikipedia.org/wiki/Hidden_surface_determination">hidden surface determination</a>.
Relative to the viewer&#8217;s position and orientation, some 3D objects may be fully or partially occluded by nearer objects.</p>

<p>An image is a function of (infinite and continuous) 2D space, so specifying that function means determining its value at every sample point.
Each point can correspond to a number of geometric objects, some closer and some further.
If we assume for now that our colors are fully opaque, then we&#8217;ll need to know the color (after transformation and lighting) of the <em>nearest</em> surface point that is projected onto the sample point.
(We&#8217;ll remove this opacity assumption later.)</p>

<p>Let&#8217;s consider how we&#8217;ll combine two <code>Geometry</code> values into one:</p>

<pre><code>union :: Geometry -&gt; Geometry -&gt; Geometry
</code></pre>

<p>Because of occlusion, the <code>render</code> function cannot be compositional with respect to <code>union</code>.
If it were, then there would exist a function <code>unionR</code> such that</p>

<pre><code>forall ga gb. render (ga `union` gb) == render ga `unionR` render gb
</code></pre>

<p>In other words, to render a union of two geometries, we can render each and combine the results.</p>

<p>The reason we can&#8217;t find such a <code>unionR</code> is that <code>render</code> doesn&#8217;t let <code>unionR</code> know how close each colored point is.
A solution then is simple: add in the missing depth information:</p>

<pre><code>type RenderingD = Image (Depth, Color)  -- first try

renderD :: Geometry -&gt; RenderingD
</code></pre>

<p>Now we have enough information for compositional rendering, i.e., we can define <code>unionR</code> such that</p>

<pre><code>forall ga gb. renderD (ga `union` gb) == renderD ga `unionR` renderD gb
</code></pre>

<p>where</p>

<pre><code>unionR :: RenderingD -&gt; RenderingD -&gt; RenderingD

unionR im im' p = if d &lt;= d' then (d,c) else (d',c')
 where
   (d ,c ) = im  p
   (d',c') = im' p
</code></pre>

<p>When we&#8217;re done composing, we can discard the depths:</p>

<pre><code>render g = snd . renderD g
</code></pre>

<p>or, with <em><a href="http://conal.net/blog/posts/semantic-editor-combinators/" title="blog post">Semantic editor combinators</a></em>:</p>

<pre><code>render = (result.result) snd renderD
</code></pre>

<h3>Simpler, prettier</h3>

<p>The <code>unionR</code> definition is not very complicated, but still, I like to tease out common structure and reuse definitions wherever I can.
The first thing I notice about <code>unionR</code> is that it works pointwise.
That is, the value at a point is a function of the values of two other images at the same point.
The pattern is captured by <code>liftA2</code> on functions, thanks to the <code>Applicative</code> instance for functions.</p>

<pre><code>liftA2 :: (b -&gt; c -&gt; d) -&gt; (a -&gt; b) -&gt; (a -&gt; c) -&gt; (a -&gt; d)
</code></pre>

<p>So that</p>

<pre><code>unionR = liftA2 closer

closer (d,c) (d',c') = if d &lt;= d' then (d,c) else (d',c')
</code></pre>

<p>Or</p>

<pre><code>closer dc@(d,_) dc'@(d',_) = if d &lt;= d' then dc else dc'
</code></pre>

<p>Or even</p>

<pre><code>closer = minBy fst
</code></pre>

<p>where</p>

<pre><code>minBy f u v = if f u &lt;= f v then u else v
</code></pre>

<p>This definition of <code>unionR</code> is not only simpler, it&#8217;s quite a bit more general, as type inference reveals:</p>

<pre><code>unionR :: (Ord a, Applicative f) =&gt; f (a,b) -&gt; f (a,b) -&gt; f (a,b)

closer :: Ord a =&gt; (a,b) -&gt; (a,b) -&gt; (a,b)
</code></pre>

<p>Once again, simplicity and generality go hand-in-hand.</p>
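
<p>These definitions are small enough to check as a self-contained sketch (the types are exactly those inferred above):</p>

<pre><code>import Control.Applicative (liftA2)

minBy :: Ord b =&gt; (a -&gt; b) -&gt; a -&gt; a -&gt; a
minBy f u v = if f u &lt;= f v then u else v

closer :: Ord a =&gt; (a,b) -&gt; (a,b) -&gt; (a,b)
closer = minBy fst

-- Works for any Applicative; for images-as-functions it composites pointwise.
unionR :: (Ord a, Applicative f) =&gt; f (a,b) -&gt; f (a,b) -&gt; f (a,b)
unionR = liftA2 closer
</code></pre>

<p>For instance, with images as functions, <code>unionR (const (1, "near")) (const (2, "far"))</code> is the image that is everywhere <code>(1, "near")</code>.</p>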

<h3>Another type class morphism</h3>

<p>Let&#8217;s see if we can make <code>union</code> rendering simpler and more inevitable.
Rendering is <em>nearly</em> a homomorphism.
That is, <code>render</code> nearly distributes over <code>union</code>, but we have to replace <code>union</code> by <code>unionR</code>.
I&#8217;d rather eliminate this discrepancy, ending up with</p>

<pre><code>forall ga gb. renderD (ga `op` gb) == renderD ga `op` renderD gb
</code></pre>

<p>for some <code>op</code> that is equal to <code>union</code> on the left and <code>unionR</code> on the right.
Since <code>union</code> and <code>unionR</code> have different types (with neither being a polymorphic instance of the other), <code>op</code> will have to be a method of some type class.</p>

<p>My favorite binary method is <code>mappend</code>, from <code>Monoid</code>, so let&#8217;s give it a try.
<code>Monoid</code> requires there also to be an identity element <code>mempty</code> and that <code>mappend</code> be associative.
For <code>Geometry</code>, we can define</p>

<pre><code>instance Monoid Geometry where
  mempty  = emptyGeometry
  mappend = union
</code></pre>

<p>Images with depth are a little trickier.
Image already has a <code>Monoid</code> instance, whose semantics is determined by the principle of <a href="http://conal.net/blog/tag/type-class-morphism/" title="Posts on type class morphisms">type class morphisms</a>, namely</p>

<blockquote>
  <p><em>The meaning of an instance is the instance of the meaning</em></p>
</blockquote>

<p>The meaning of an image is a function, and functions have a <code>Monoid</code> instance:</p>

<pre><code>instance Monoid b =&gt; Monoid (a -&gt; b) where
  mempty = const mempty
  f `mappend` g = \ a -&gt; f a `mappend` g a
</code></pre>

<p>which simplifies nicely to a standard form, by using the <code>Applicative</code> instance for functions.</p>

<pre><code>instance Applicative ((-&gt;) a) where
  pure      = const
  hf &lt;*&gt; xf = \ a -&gt; (hf a) (xf a)

instance Monoid b =&gt; Monoid (a -&gt; b) where
  mempty  = pure   mempty
  mappend = liftA2 mappend
</code></pre>

<p>We&#8217;re in luck.
Since we&#8217;ve defined <code>unionR</code> as <code>liftA2 closer</code>, we just need it to turn out that <code>closer == mappend</code> and that <code>closer</code> is associative and has an identity element.</p>

<p>However, <code>closer</code> is defined on pairs, and the standard <code>Monoid</code> instance on pairs doesn&#8217;t fit.</p>

<pre><code>instance (Monoid a, Monoid b) =&gt; Monoid (a,b) where
  mempty = (mempty,mempty)
  (a,b) `mappend` (a',b') = (a `mappend` a', b `mappend` b')
</code></pre>

<p>To avoid this conflict, define a new data type to be used in place of pairs.</p>

<pre><code>data DepthG d a = Depth d a  -- first try
</code></pre>

<p>Alternatively,</p>

<pre><code>newtype DepthG d a = Depth { unDepth :: (d,a) }
</code></pre>

<p>I&#8217;ll go with this latter version, as it turns out to be more convenient.</p>

<p>Then we can define our monoid:</p>

<pre><code>instance (Ord d, Bounded d) =&gt; Monoid (DepthG d a) where
  mempty  = Depth (maxBound,undefined)
  Depth p `mappend` Depth p' = Depth (p `closer` p')
</code></pre>

<p>The second method definition can be simplified nicely</p>

<pre><code>  mappend = inDepth2 closer
</code></pre>

<p>where</p>

<pre><code>  inDepth2 = unDepth ~&gt; unDepth ~&gt; Depth
</code></pre>

<p>using the ideas from <em><a href="http://conal.net/blog/posts/prettier-functions-for-wrapping-and-wrapping/" title="blog post">Prettier functions for wrapping and wrapping</a></em> and the notational improvement from Matt Hellige&#8217;s <em><a href="http://matt.immute.net/content/pointless-fun" title="blog post by Matt Hellige">Pointless fun</a></em>.</p>

<h3>FRP &#8212; Future values</h3>

<p>The <code>Monoid</code> instance for <code>Depth</code> may look familiar to you if you&#8217;ve been following along with my <a href="http://conal.net/blog/tag/future-value/" title="Posts on futures values">future value</a>s or have read the paper <em><a href="http://conal.net/papers/simply-reactive" title="Paper: &quot;Simply efficient functional reactivity&quot;">Simply efficient functional reactivity</a></em>.
A <em>future value</em> has a time and a value.
Usually, the value cannot be known until its time arrives.</p>

<pre><code>newtype FutureG t a = Future (Time t, a)

instance (Ord t, Bounded t) =&gt; Monoid (FutureG t a) where
  mempty = Future (maxBound, undefined)
  Future (s,a) `mappend` Future (t,b) =
    Future (s `min` t, if s &lt;= t then a else b)
</code></pre>

<p>When we&#8217;re using a non-lazy (flat) representation of time, this <code>mappend</code> definition can be written more simply:</p>

<pre><code>  mappend = minBy futTime

  futTime (Future (t,_)) = t
</code></pre>

<p>Equivalently,</p>

<pre><code>  mappend = inFuture2 (minBy fst)
</code></pre>

<p>There&#8217;s really nothing special about time in the <code>Time</code> type.
It is just a synonym for the <a href="http://hackage.haskell.org/packages/archive/reactive/latest/doc/html/Data-Max.html" title="module documentation"><code>Max</code> monoid</a>, as needed for the <code>Applicative</code> and <code>Monad</code> instances.</p>
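
<p>As a self-contained sketch that can be checked directly, here is the same monoid, simplified to use a plain <code>Ord</code>/<code>Bounded</code> time instead of the <code>Time</code> wrapper, and split into the <code>Semigroup</code>/<code>Monoid</code> pair that modern GHC requires:</p>

<pre><code>newtype FutureG t a = Future (t, a)

futTime :: FutureG t a -&gt; t
futTime (Future (t, _)) = t

futVal :: FutureG t a -&gt; a
futVal (Future (_, a)) = a

instance Ord t =&gt; Semigroup (FutureG t a) where
  Future (s, a) &lt;&gt; Future (t, b) =
    Future (s `min` t, if s &lt;= t then a else b)

instance (Ord t, Bounded t) =&gt; Monoid (FutureG t a) where
  mempty = Future (maxBound, undefined)  -- the never-occurring future
</code></pre>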

<p>This connection with future values means we can discard more code.</p>

<pre><code>type RenderingD d = Image (FutureG d Color)
renderD :: (Ord d, Bounded d) =&gt; Geometry -&gt; RenderingD d
</code></pre>

<p>Now we have our monoid (homo)morphism properties:</p>

<pre><code>renderD mempty == mempty

renderD (ga `mappend` gb) == renderD ga `mappend` renderD gb
</code></pre>

<p>And we&#8217;ve eliminated the code for <code>unionR</code> by reusing an existing type (future values).</p>

<h3>Future values?</h3>

<p>What does it mean to think about depth/color pairs as being &#8220;future&#8221; colors?
If we were to probe outward along a ray, say at the speed of light, we would bump into some number of 3D objects.
The one we hit earliest is the nearest, so in this sense, <code>mappend</code> on futures (choosing the earlier one) is the right tool for the job.</p>

<p>I once read that a popular belief in the past was that vision (light) reaches outward to strike objects, as I&#8217;ve just described.
I&#8217;ve forgotten where I read about that belief, though I think in a book about perspective, and I&#8217;d appreciate a pointer from someone else who might have a reference.</p>

<p>We moderns believe that light travels to us from the objects we see.
What we see of nearby objects comes from the very recent past, while of further objects we see the more remote past.
From this modern perspective, therefore, the connection I&#8217;ve made with future values is exactly backward.
Now that I think about it in this way, of course it&#8217;s backward, because we see (slightly) into the past rather than the future.</p>

<p>Fixing this conceptual flaw is simple: define a type of &#8220;past values&#8221;.
Give them exactly the same representation as future values, and derive their class instances entirely from that representation.</p>

<pre><code>newtype PastG t a = Past (FutureG t a)
  deriving (Monoid, Functor, Applicative, Monad)
</code></pre>

<p>Alternatively, choose a temporally neutral replacement for the name &#8220;future values&#8221;.</p>

<h3>The bug in Z-buffering</h3>

<p>The <code>renderD</code> function implements continuous, infinite Z-buffering, with <code>mappend</code> performing the z-compare and conditional overwrite.
Z-buffering is the dominant algorithm used in real-time 3D graphics and is supported in hardware even on low-end graphics cards (though not in its full continuous and infinite glory).</p>

<p>However, Z-buffering also has a serious bug: it is only correct for fully opaque colors.
Consider a geometry <code>g</code> and a point <code>p</code> in the domain of the result image.
There may be many different points in <code>g</code> that project to <code>p</code>.
If <code>g</code> has only fully opaque colors, then at most one place on <code>g</code> contributes to the rendered image at <code>p</code>, and specifically, the nearest such point.
If <code>g</code> is the <code>union</code> (<code>mappend</code>) of two other geometries, <code>g == ga `union` gb</code>, then the nearest contribution of <code>g</code> (for <code>p</code>) will be the nearer (<code>mappend</code>) of the nearest contributions of <code>ga</code> and of <code>gb</code>.</p>

<p>When colors may be <em>partially</em> opaque, the color of the rendering at a point <code>p</code> can depend on <em>all</em> of the points in the geometry that get projected to <code>p</code>.
Correct rendering in the presence of partial opacity requires a <code>fold</code> that combines all of the colors that project onto a point, <em>in order of distance</em>, where the color-combining function (alpha-blending) is <em>not</em> commutative.
Consider again <code>g == ga `union` gb</code>.
The contributions of <code>ga</code> to <code>p</code> might be entirely closer than the contributions of <code>gb</code>, or entirely further, or interleaved.
If interleaved, then the colors generated from each cannot be combined into a single color for further combination.
To handle the general case, replace the single distance/color pair with an ordered <em>collection</em> of them:</p>

<pre><code>type RenderingD d = Image [FutureG d Color]  -- multiple projections, first try
</code></pre>

<p>Rendering a <code>union</code> (<code>mappend</code>) requires a merging of two lists of futures (distance/color pairs) into a single one.</p>
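
<p>That merging is ordinary order-preserving list merging.
A sketch, representing each future simply as a distance/color pair and assuming both input lists are already sorted by distance:</p>

<pre><code>-- Merge two distance-ordered lists, keeping the result ordered.
-- At equal distances, elements of the first list come first.
merge :: Ord d =&gt; [(d, c)] -&gt; [(d, c)] -&gt; [(d, c)]
merge [] ys = ys
merge xs [] = xs
merge xs@(x@(dx, _) : xs') ys@(y@(dy, _) : ys')
  | dx &lt;= dy  = x : merge xs' ys
  | otherwise = y : merge xs ys'
</code></pre>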

<h3>More FRP &#8212; Events</h3>

<p>Sadly, we&#8217;ve now lost our monoid morphism, because list <code>mappend</code> is <code>(++)</code>, not the required merging.
However, we can fix this problem as we did before, by introducing a new type.</p>

<p>Or, we can look for an existing type that matches our required semantics.
There is just such a thing in the <em><a href="http://haskell.org/haskellwiki/Reactive" title="Wiki page for the Reactive library">Reactive</a></em> formulation of FRP, namely an <em>event</em>.
We can simply use the FRP <code>Event</code> type:</p>

<pre><code>type RenderingD d = Image (EventG d Color)

renderD :: (Ord d, Bounded d) =&gt; Geometry -&gt; RenderingD d
</code></pre>

<h3>Spatial transformation</h3>

<p>Introducing depths allowed rendering to be defined compositionally with respect to geometric union.
Is the depth model, enhanced with lists (events), sufficient for compositionality of rendering with respect to other <code>Geometry</code> operations as well?
Let&#8217;s look at spatial transformation.</p>

<pre><code>(*%)  :: Transform3 -&gt; Geometry -&gt; Geometry
</code></pre>

<p>Compositionality of rendering would mean that we can render <code>xf *% g</code> by rendering <code>g</code> and then using <code>xf</code> in some way to transform that rendering.
In other words, there would have to exist a function <code>(*%%)</code> such that</p>

<pre><code>forall xf g. renderD (xf *% g) == xf *%% renderD g
</code></pre>

<p>I don&#8217;t know if the required <code>(*%%)</code> function exists, or what restrictions on <code>Geometry</code> or <code>Transform3</code> it implies, or whether such a function could be useful in practice.
Instead, let&#8217;s change the type of renderings again, so that rendering can accumulate transformations and apply them to surfaces.</p>

<pre><code>type RenderingDX d = Transform3 -&gt; RenderingD d

renderDX :: (Ord d, Bounded d) =&gt; Geometry -&gt; RenderingDX d
</code></pre>

<p>with or without correct treatment of partial opacity (i.e., using events or futures, respectively).</p>

<p>This new function has a simple specification:</p>

<pre><code>renderDX g xf == renderD (xf *% g)
</code></pre>

<p>from which it follows that</p>

<pre><code>renderD g == renderDX g identityX
</code></pre>

<p>Rendering a transformed geometry then is a simple accumulation, justified as follows:</p>

<pre><code>renderDX (xfi *% g)

  == {- specification of renderDX -}

\ xfo -&gt; renderD (xfo *% (xfi *% g))

  == {- property of transformation -}

\ xfo -&gt; renderD ((xfo `composeX` xfi) *% g)

  == {- specification of renderDX  -}

\ xfo -&gt; renderDX g (xfo `composeX` xfi)
</code></pre>

<p>Render an empty geometry:</p>

<pre><code>renderDX mempty

  == {- specification of renderDX -}

\ xf -&gt; renderD (xf *% mempty)

  == {- property of (*%) and mempty -}

\ xf -&gt; renderD mempty

  == {- renderD is a monoid morphism -}

\ xf -&gt; mempty

  == {- definition of pure on functions -}

pure mempty

  == {- definition of mempty on functions -}

mempty
</code></pre>

<p>Render a geometric union:</p>

<pre><code>renderDX (ga `mappend` gb)

  == {- specification of renderDX -}

\ xf -&gt; renderD (xf *% (ga `mappend` gb))

  == {- property of transformation and union -}

\ xf -&gt; renderD ((xf *% ga) `mappend` (xf *% gb))

  == {- renderD is a monoid morphism -}

\ xf -&gt; renderD (xf *% ga) `mappend` renderD (xf *% gb)

  == {- specification of renderDX  -}

\ xf -&gt; renderDX ga xf `mappend` renderDX gb xf

  == {- definition of liftA2/(&lt;*&gt;) on functions -}

liftA2 mappend (renderDX ga) (renderDX gb)

  == {- definition of mappend on functions -}

renderDX ga `mappend` renderDX gb
</code></pre>

<p>Hurray!
<code>renderDX</code> is still a monoid morphism.</p>

<p>The two properties of transformation and union used above say together that <code>(xf *%)</code> is a monoid morphism for all transforms <code>xf</code>.</p>
<p><a href="http://conal.net/blog/?flattrss_redirect&amp;id=75&amp;md5=a47ae6e1e1a51016836d913e562dbd3e"><img src="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png" srcset="http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white.png, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@2x.png 2x, http://conal.net/blog/wp-content/plugins/flattr/img/flattr-badge-white@3x.png 3x" alt="Flattr this!"/></a></p>]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/3d-rendering-as-functional-reactive-programming/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2F3d-rendering-as-functional-reactive-programming&amp;language=en_GB&amp;category=text&amp;title=3D+rendering+as+functional+reactive+programming&amp;description=I%26%238217%3Bve+been+playing+with+a+simple%2Fgeneral+semantics+for+3D.+In+the+process%2C+I+was+surprised+to+see+that+a+key+part+of+the+semantics+looks+exactly+like+a+key+part...&amp;tags=3D%2CFRP%2Cfunctional+reactive+programming%2Cmonoid%2Csemantics%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Another lovely example of type class morphisms</title>
		<link>http://conal.net/blog/posts/another-lovely-example-of-type-class-morphisms</link>
		<comments>http://conal.net/blog/posts/another-lovely-example-of-type-class-morphisms#comments</comments>
		<pubDate>Fri, 14 Nov 2008 06:20:06 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[applicative functor]]></category>
		<category><![CDATA[fold]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[type class]]></category>
		<category><![CDATA[type class morphism]]></category>
		<category><![CDATA[zip]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=58</guid>
		<description><![CDATA[I read Max Rabkin&#8217;s recent post Beautiful folding with great excitement. He shows how to combine multiple folds over the same list into a single pass, which can then drastically reduce memory requirements of a lazy functional program. Max&#8217;s trick is giving folds a data representation and a way to combine representations that corresponds [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Another lovely example of type class morphisms

Tags: applicative functor, functor, type class morphism, semantics, type class, fold, zip

URL: http://conal.net/blog/posts/another-lovely-example-of-type-class-morphisms/

-->

<!-- references -->

<!-- teaser -->

<p>I read Max Rabkin&#8217;s recent post <a href="http://squing.blogspot.com/2008/11/beautiful-folding.html" title="blog post by Max Rabkin">Beautiful folding</a> with great excitement.
He shows how to combine multiple folds over the same list into a single pass, which can then drastically reduce memory requirements of a lazy functional program.
Max&#8217;s trick is giving folds a data representation and a way to combine representations that corresponds to combining the folds.</p>

<p>Peeking out from behind Max&#8217;s definitions is a lovely pattern I&#8217;ve been noticing more and more over the last couple of years, namely <a href="http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms" title="blog post">type class morphisms</a>.</p>

<!--
**Edits**:

* 2008-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-58"></span></p>

<h3>Folds as data</h3>

<p>Max gives a data representation of folds and adds on a post-fold step, which makes them composable.</p>

<pre><code>data Fold b c = forall a. F (a -&gt; b -&gt; a) a (a -&gt; c)
</code></pre>

<p>The components of a <code>Fold</code> are a (strict left) fold&#8217;s combiner function and initial value, plus a post-fold step.
This interpretation is done by a function <code>cfoldl'</code>, which turns these data folds into function folds:</p>

<pre><code>cfoldl' :: Fold b c -&gt; [b] -&gt; c
cfoldl' (F op e k) = k . foldl' op e
</code></pre>

<p>where <code>foldl'</code> is the standard strict, left-fold functional:</p>

<pre><code>foldl' :: (a -&gt; b -&gt; a) -&gt; a -&gt; [b] -&gt; a
foldl' op a []     = a
foldl' op a (b:bs) =
  let a' = a `op` b in a' `seq` foldl' op a' bs
</code></pre>

<h3>Standard classes</h3>

<p>As Twan van Laarhoven pointed out in a comment on <a href="http://squing.blogspot.com/2008/11/beautiful-folding.html" title="blog post by Max Rabkin">Max&#8217;s post</a>, <code>Fold b</code> is a functor and an applicative functor, so some of Max&#8217;s <code>Fold</code>-manipulating functions can be replaced by standard vocabulary.</p>

<p>The <code>Functor</code> instance is pretty simple:</p>

<pre><code>instance Functor (Fold b) where
  fmap h (F op e k) = F op e (h . k)
</code></pre>

<p>The <code>Applicative</code> instance is a bit trickier.
For strictness, Max used a type of strict pairs:</p>

<pre><code>data Pair c c' = P !c !c'
</code></pre>

<p>The instance:</p>

<pre><code>instance Applicative (Fold b) where
  pure a = F (error "no op") (error "no e") (const a)

  F op e k &lt;*&gt; F op' e' k' = F op'' e'' k''
   where
     P a a' `op''` b = P (a `op` b) (a' `op'` b)
     e''             = P e e'
     k'' (P a a')    = (k a) (k' a')
</code></pre>

<p>Given that <code>Fold b</code> is an applicative functor, Max&#8217;s <code>bothWith</code> function is then <code>liftA2</code>.
Max&#8217;s <code>multi-cfoldl'</code> rule then becomes:</p>

<pre><code>forall h f g xs.
   h (cfoldl' f xs) (cfoldl' g xs) == cfoldl' (liftA2 h f g) xs
</code></pre>

<p>thus replacing two passes with one pass.</p>
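
<p>Assembling Max&#8217;s pieces into a self-contained sketch (using a total definition of <code>pure</code>, as discussed below) shows the one-pass combination in action, computing a mean as a single traversal:</p>

<pre><code>{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

data Fold b c = forall a. F (a -&gt; b -&gt; a) a (a -&gt; c)

cfoldl' :: Fold b c -&gt; [b] -&gt; c
cfoldl' (F op e k) = k . foldl' op e

instance Functor (Fold b) where
  fmap h (F op e k) = F op e (h . k)

data Pair c c' = P !c !c'

instance Applicative (Fold b) where
  pure a = F (\_ _ -&gt; ()) () (const a)
  F op e k &lt;*&gt; F op' e' k' = F op'' (P e e') k''
   where
     P a a' `op''` b = P (a `op` b) (a' `op'` b)
     k'' (P a a')    = k a (k' a')

sumF, lengthF :: Fold Double Double
sumF    = F (+) 0 id
lengthF = F (\n _ -&gt; n + 1) 0 id

-- Sum and length fused into a single strict left fold.
meanF :: Fold Double Double
meanF = (/) &lt;$&gt; sumF &lt;*&gt; lengthF
</code></pre>

<p>so that <code>cfoldl' meanF [1,2,3,4]</code> traverses the list once and yields <code>2.5</code>.</p>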

<h3>Beautiful properties</h3>

<p>Now here&#8217;s the fun part.
Looking at the <code>Applicative</code> instance for <code>((-&gt;) a)</code>, the rule above is equivalent to</p>

<pre><code>forall h f g.
  liftA2 h (cfoldl' f) (cfoldl' g) == cfoldl' (liftA2 h f g)
</code></pre>

<p>Flipped around, this rule says that <code>liftA2</code> distributes over <code>cfoldl'</code>.
Or, &#8220;the meaning of <code>liftA2</code> is <code>liftA2</code>&#8221;.
Neat, huh?</p>

<p>Moreover, this <code>liftA2</code> property is equivalent to the following:</p>

<pre><code>forall f g.
  cfoldl' f &lt;*&gt; cfoldl' g == cfoldl' (f &lt;*&gt; g)
</code></pre>

<p>This form is one of the two <code>Applicative</code> morphism laws (which I usually write in the reverse direction).</p>

<p>For more about these morphisms, see <a href="http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms" title="blog post">Simplifying semantics with type class morphisms</a>.
That post suggests that semantic functions in particular ought to be type class morphisms (and if not, then you&#8217;d have an abstraction leak).
And <code>cfoldl'</code> is a semantic function, in that it gives meaning to a <code>Fold</code>.</p>

<p>The other type class morphisms in this case are</p>

<pre><code>cfoldl' (pure a  ) == pure a

cfoldl' (fmap h f) == fmap h (cfoldl' f)
</code></pre>

<p>Given the <code>Functor</code> and <code>Applicative</code> instances of <code>((-&gt;) a)</code>, these two properties are equivalent to</p>

<pre><code>cfoldl' (pure a  ) == const a

cfoldl' (fmap h f) == h . cfoldl' f
</code></pre>

<p>or</p>

<pre><code>cfoldl' (pure a  ) xs == a

cfoldl' (fmap h f) xs == h (cfoldl' f xs)
</code></pre>

<h3>Rewrite rules</h3>

<p>Max pointed out that GHC does not handle his original <code>multi-cfoldl'</code> rule.
The reason is that the head of the LHS (left-hand side) is a variable.
However, the type class morphism laws have constant (known) functions at the head, so I expect they could usefully act as fusion rewrite rules.</p>

<h3>Inevitable instances</h3>

<p>Given the implementations (instances) of <code>Functor</code> and <code>Applicative</code> for <code>Fold</code>, I&#8217;d like to verify that the morphism laws for <code>cfoldl'</code> (above) hold.</p>

<h4>Functor</h4>

<p>Start with <code>fmap</code>.
The morphism law:</p>

<pre><code>cfoldl' (fmap h f) == fmap h (cfoldl' f)
</code></pre>

<p>First, give the <code>Fold</code> argument more structure, so that (without loss of generality) the law becomes</p>

<pre><code>cfoldl' (fmap h (F op e k)) == fmap h (cfoldl' (F op e k))
</code></pre>

<p>The game is to work backward from this law to the definition of <code>fmap</code> for <code>Fold</code>.
I&#8217;ll do so by massaging the RHS (right-hand side) into the form <code>cfoldl' (...)</code>, where &#8220;<code>...</code>&#8221; is the definition <code>fmap h (F op e k)</code>.</p>

<pre><code>fmap h (cfoldl' (F op e k))

  ==  {- inline cfoldl' -}

fmap h (k . foldl' op e)

  ==  {- inline fmap on functions -}

h . (k . foldl' op e)

  ==  {- associativity of (.) -}

(h . k) . foldl' op e

  ==  {- uninline cfoldl' -}

cfoldl' (F op e (h . k))

  ==  {- uninline fmap on Fold  -}

cfoldl' (fmap h (F op e k))
</code></pre>

<p>This proof shows why Max had to add the post-fold function <code>k</code> to his <code>Fold</code> type.
If <code>k</code> weren&#8217;t there, we couldn&#8217;t have buried the <code>h</code> in it.</p>
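<p>The derivation pins down the <code>Functor</code> instance. Here it is as a standalone sketch (repeating the <code>Fold</code> type and runner so the fragment compiles on its own):</p>

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

data Fold b c = forall a. F (a -> b -> a) a (a -> c)

cfoldl' :: Fold b c -> [b] -> c
cfoldl' (F op e k) = k . foldl' op e

-- The definition the proof works backward to: bury h in the post-fold k.
instance Functor (Fold b) where
  fmap h (F op e k) = F op e (h . k)
```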

<p>More usefully, this proof suggests how we could have discovered the <code>fmap</code> definition.
For instance, we might have tried with a simpler and more obvious <code>Fold</code> representation:</p>

<pre><code>data FoldS a b = FS (a -&gt; b -&gt; a) a
</code></pre>

<p>Getting into the <code>fmap</code> derivation, we&#8217;d come to</p>

<pre><code>h . foldl' op e
</code></pre>

<p>and then we&#8217;d be stuck.
But not really, because the awkward extra bit (<code>h .</code>) beckons us to generalize by adding Max&#8217;s post-fold function.</p>

<h4>Applicative</h4>

<p>Next, <code>pure</code>:</p>

<pre><code>cfoldl' (pure a) == pure a
</code></pre>

<p>Reason as before, starting with the RHS:</p>

<pre><code>pure a

  ==  {- inline pure on functions -}

const a

  ==  {- property of const -}

const a . foldl' op e

  ==  {- uninline cfoldl' -}

cfoldl' (F op e (const a))
</code></pre>

<p>The imaginative step was inventing structure to match the definition of <code>cfoldl'</code>.
This equation holds for <em>any</em> values of <code>op</code> and <code>e</code>, so we can use bottom in the definition:</p>

<pre><code>instance Applicative (Fold b) where
  pure a = F undefined undefined (const a)
</code></pre>

<p>As Twan noticed, the existential (<code>forall</code>) also lets us pick defined values for <code>op</code> and <code>e</code>.
He chose <code>(\_ _ -&gt; ())</code> and <code>()</code>.</p>
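<p>In code, as a standalone sketch (<code>pureFold</code> is a hypothetical name for what would become the <code>pure</code> method):</p>

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

data Fold b c = forall a. F (a -> b -> a) a (a -> c)

cfoldl' :: Fold b c -> [b] -> c
cfoldl' (F op e k) = k . foldl' op e

-- Twan's defined choice: fold the list into (), then ignore the result.
pureFold :: c -> Fold b c
pureFold a = F (\_ _ -> ()) () (const a)
```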

<p>The derivation of <code>(&lt;*&gt;)</code> is trickier and is the heart of the problem of fusing folds to reduce multiple traversals to a single one.
Why the heart?  Because <code>(&lt;*&gt;)</code> is all about <em>combining</em> two things into one.</p>

<h4>Intermission</h4>

<p>I&#8217;m taking a break here.
While fiddling with a proof of the <code>(&lt;*&gt;)</code> morphism law, I realized a simpler way to structure these folds, which will be the topic of an upcoming post.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/another-lovely-example-of-type-class-morphisms/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fanother-lovely-example-of-type-class-morphisms&amp;language=en_GB&amp;category=text&amp;title=Another+lovely+example+of+type+class+morphisms&amp;description=I+read+Max+Rabkin%26%238217%3Bs+recent+post+Beautiful+folding+with+great+excitement.+He+shows+how+to+make+combine+multiple+folds+over+the+same+list+into+a+single+pass%2C+which+can+then...&amp;tags=applicative+functor%2Cfold%2Cfunctor%2Csemantics%2Ctype+class%2Ctype+class+morphism%2Czip%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Simplifying semantics with type class morphisms</title>
		<link>http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms</link>
		<comments>http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms#comments</comments>
		<pubDate>Wed, 09 Apr 2008 04:22:35 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[applicative functor]]></category>
		<category><![CDATA[FRP]]></category>
		<category><![CDATA[functional reactive programming]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[monad]]></category>
		<category><![CDATA[monoid]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[type class]]></category>
		<category><![CDATA[type class morphism]]></category>

		<guid isPermaLink="false">http://conal.net/blog/?p=23</guid>
		<description><![CDATA[When I first started playing with functional reactivity in Fran and its predecessors, I didn&#8217;t realize that much of the functionality of events and reactive behaviors could be packaged via standard type classes. Then Conor McBride &#38; Ross Paterson introduced us to applicative functors, and I remembered using that pattern to reduce all of the [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Simplifying semantics with type class morphisms

Tags: type class, functor, applicative functor, monad, monoid, type class morphism, semantics, FRP, functional reactive programming

URL: http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms

-->

<!-- references -->

<!-- teaser -->

<p>When I first started playing with functional reactivity in Fran and its predecessors, I didn&#8217;t realize that much of the functionality of events and reactive behaviors could be packaged via standard type classes.
Then Conor McBride &amp; Ross Paterson introduced us to <em><a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Applicative.html" title="Documentation for Control.Applicative: applicative functors">applicative functors</a></em>, and I remembered using that pattern to reduce all of the lifting operators in Fran to just two, which correspond to <code>pure</code> and <code>(&lt;*&gt;)</code> in the <code>Applicative</code> class.
So, in working on a new library for functional reactive programming (FRP), I thought I&#8217;d modernize the interface to use standard type classes as much as possible.</p>

<p>While spelling out a precise (denotational) semantics for the FRP instances of these classes, I noticed a lovely recurring pattern:</p>

<blockquote>
  <p>The meaning of each method corresponds to the same method for the meaning.</p>
</blockquote>

<p>In this post, I&#8217;ll give some examples of this principle and muse a bit over its usefulness.
For more details, see the paper <em><a href="http://conal.net/blog/posts/simply-efficient-functional-reactivity/" title="Blog post: &quot;Simply efficient functional reactivity&quot;">Simply efficient functional reactivity</a></em>.
Another post will start exploring type class morphisms and type composition, and ask questions I&#8217;m wondering about.</p>

<!--
**Edits**:

* 2008-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-23"></span></p>

<h3>Behaviors</h3>

<p>The meaning of a (reactive) behavior is a function from time:</p>

<pre><code>type B a = Time -&gt; a

at :: Behavior a -&gt; B a
</code></pre>

<p>So the semantic function, <code>at</code>, maps from the <code>Behavior</code> type (for use in FRP programs) to the <code>B</code> type (for understanding FRP programs).</p>

<p>As a simple example, the meaning of the behavior <code>time</code> is the identity function:</p>

<pre><code>at time == id
</code></pre>

<h4>Functor</h4>

<p>Given <code>b :: Behavior a</code> and a function <code>f :: a -&gt; b</code>, we can apply <code>f</code> to the value of <code>b</code> at every moment in (infinite and continuous) time.
This operation corresponds to the <code>Functor</code> method <code>fmap</code>, so</p>

<pre><code>instance Functor Behavior where ...
</code></pre>

<p>The informal description of <code>fmap</code> on behaviors translates to a formal definition of its semantics:</p>

<pre><code>  fmap f b `at` t == f (b `at` t)
</code></pre>

<p>Equivalently,</p>

<pre><code>  at (fmap f b) == \t -&gt; f (b `at` t)
                == f . (\t -&gt; b `at` t)
                == f . at b
</code></pre>

<p>Now here&#8217;s the fun part.
While <code>Behavior</code> is a functor, <em>so is its meaning</em>:</p>

<pre><code>instance Functor ((-&gt;) t) where fmap = (.)
</code></pre>

<p>So, replacing <code>f . at b</code> with <code>fmap f (at b)</code> above,</p>

<pre><code>  at (fmap f b) == fmap f (at b)
</code></pre>

<p>which can also be written</p>

<pre><code>  at . fmap f == fmap f . at
</code></pre>

<p>Keep in mind that the <code>fmap</code> on the left is on behaviors, and the one on the right is on functions (of time).</p>

<p>This last equation can also be written as a simple square commutative diagram and is sometimes expressed by saying that <code>at</code> is a &#8220;natural transformation&#8221; or &#8220;morphism on functors&#8221; [<a href="http://books.google.com/books?id=eBvhyc4z8HQC" title="Book: &quot;Categories for the Working Mathematician&quot; by Saunders Mac Lane">Categories for the Working Mathematician</a>].
For consistency with similar properties on other type classes, I suggest &#8220;functor morphism&#8221; as a synonym for natural transformation.</p>

<p>The <a href="http://www.haskell.org/haskellwiki/Category_theory/Natural_transformation" title="Haskell wiki page: &quot;Haskell wiki page on natural transformations&quot;">Haskell wiki page on natural transformations</a> shows the commutative diagram and gives <code>maybeToList</code> as another example.</p>
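<p>Naturality of <code>maybeToList</code> is easy to check at a sample value (a quick sketch):</p>

```haskell
import Data.Maybe (maybeToList)

-- maybeToList commutes with fmap, i.e. it is a natural transformation:
--   maybeToList . fmap f == fmap f . maybeToList
lhs, rhs :: [Int]
lhs = maybeToList (fmap (+1) (Just 41))
rhs = fmap (+1) (maybeToList (Just 41))
```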

<h4>Applicative functor</h4>

<p>The <code>fmap</code> method applies a static (not time-varying) function to a dynamic (time-varying) argument.
A more general operation applies a dynamic function to a dynamic argument.
Also useful is promoting a static value to a dynamic one.
These two operations correspond to <code>(&lt;*&gt;)</code> and <code>pure</code> for <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Applicative.html" title="Documentation for Control.Applicative: applicative functors">applicative functors</a>:</p>

<pre><code>infixl 4 &lt;*&gt;
class Functor f =&gt; Applicative f where
  pure  :: a -&gt; f a
  (&lt;*&gt;) :: f (a-&gt;b) -&gt; f a -&gt; f b
</code></pre>

<p>where, e.g., <code>f == Behavior</code>.</p>

<p>From these two methods, all of the n-ary lifting functions follow.
For instance,</p>

<pre><code>liftA3 :: Applicative f =&gt;
          (  a -&gt;   b -&gt;   c -&gt;   d)
       -&gt;  f a -&gt; f b -&gt; f c -&gt; f d
liftA3 h fa fb fc = pure h &lt;*&gt; fa &lt;*&gt; fb &lt;*&gt; fc
</code></pre>

<p>Or use <code>fmap h fa</code> in place of <code>pure h &lt;*&gt; fa</code>.
For prettier code, <code>(&lt;$&gt;)</code> (left infix) is synonymous with <code>fmap</code>.</p>
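<p>For example, over <code>Maybe</code> (another applicative functor), <code>liftA3</code> combines three wrapped values, and any <code>Nothing</code> propagates:</p>

```haskell
import Control.Applicative (liftA3)

-- liftA3 over Maybe: the result is present only if all three inputs are.
triple :: Maybe (Int, Int, Int)
triple = liftA3 (,,) (Just 1) (Just 2) (Just 3)

missing :: Maybe (Int, Int, Int)
missing = liftA3 (,,) (Just 1) Nothing (Just 3)
```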

<p>Now, what about semantics?
Applying a dynamic function <code>fb</code> to a dynamic argument <code>xb</code> gives a dynamic result, whose value at time <code>t</code> is the value of <code>fb</code> at <code>t</code>, applied to the value of <code>xb</code> at <code>t</code>.</p>

<pre><code>at (fb &lt;*&gt; xb) == \t -&gt; (fb `at` t) (xb `at` t)
</code></pre>

<p>The <code>(&lt;*&gt;)</code> operator is the heart of FRP&#8217;s concurrency model, which is determinate, synchronous, and continuous.</p>

<p>Promoting a static value yields a constant behavior:</p>

<pre><code>at (pure a) == \t -&gt; a
            == const a
</code></pre>

<p>As with <code>Functor</code>, let&#8217;s look at the <code>Applicative</code> instance of functions (the meaning of behaviors):</p>

<pre><code>instance Applicative ((-&gt;) t) where
  pure a    = const a
  hf &lt;*&gt; xf = \t -&gt; (hf t) (xf t)
</code></pre>

<p>Wow &#8212; these two definitions look a lot like the meanings given above for <code>pure</code> and <code>(&lt;*&gt;)</code> on behaviors.
And sure enough, we can use the function instance to simplify these semantic definitions:</p>

<pre><code>at (pure a)    == pure a
at (fb &lt;*&gt; xb) == at fb &lt;*&gt; at xb
</code></pre>

<p>Thus the semantic function distributes over the <code>Applicative</code> methods.
In other words, the meaning of each method is the method on the meaning.
I don&#8217;t know of any standard term (like &#8220;natural transformation&#8221;) for this relationship between <code>at</code> and <code>pure</code>/<code>(&lt;*&gt;)</code>.
I suggest calling <code>at</code> an &#8220;applicative functor morphism&#8221;.</p>
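<p>To see the two sides line up concretely, here is a toy check in which a behavior is carried directly as its meaning, so <code>at</code> is just unwrapping. This is purely illustrative, not the library&#8217;s representation:</p>

```haskell
type Time = Double

-- Toy Behavior: represented by its own meaning (a function of time).
newtype Behavior a = B { at :: Time -> a }

instance Functor Behavior where
  fmap f b = B (f . at b)

instance Applicative Behavior where
  pure a    = B (const a)
  fb <*> xb = B (\t -> (fb `at` t) (xb `at` t))

time :: Behavior Time
time = B id

-- Sample both sides of  at (fb <*> xb) == at fb <*> at xb  at t = 5,
-- using the standard ((->) Time) instances on the right.
lhs, rhs :: Time
lhs = (fmap (+) time <*> pure 2) `at` 5
rhs = (fmap (+) (at time) <*> pure 2) 5
```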

<h4>Monad</h4>

<p>Monad morphisms are a bit trickier, due to the types.
There are two equivalent forms of the definition of a monad morphism, depending on whether you use <code>join</code> or <code>(&gt;&gt;=)</code>.
In the <code>join</code> form (e.g., in <a href="http://citeseer.ist.psu.edu/wadler92comprehending.html" title="Paper: &quot;Comprehending Monads&quot;">Comprehending Monads</a>, section 6), for monads <code>m</code> and <code>n</code>, the function <code>nu :: forall a. m a -&gt; n a</code> is a monad morphism if</p>

<pre><code>nu . join == join . nu . fmap nu
</code></pre>

<p>where</p>

<pre><code>join :: Monad m =&gt; m (m a) -&gt; m a
</code></pre>

<p>For behavior semantics, <code>m == Behavior</code>, <code>n == B == (-&gt;) Time</code>, and <code>nu == at</code>.</p>

<p>Then <code>at</code> is also a monad morphism if</p>

<pre><code>at (return a) == return a
at (join bb)  == join (at (fmap at bb))
</code></pre>

<p>And, since for functions <code>f</code>,</p>

<pre><code>fmap h f == h . f
join f   == \t -&gt; f t t
</code></pre>

<p>the second condition is</p>

<pre><code>at (join bb) == join (at (fmap at bb))
             == \t -&gt; (at . at bb) t t
             == \t -&gt; at (at bb t) t
             == \t -&gt; (bb `at` t) `at` t
</code></pre>

<p>So sampling <code>join bb</code> at <code>t</code> means sampling <code>bb</code> at <code>t</code> to get a behavior <code>b</code>, which is also sampled at <code>t</code>.
That&#8217;s exactly what I&#8217;d guess <code>join</code> to mean on behaviors.</p>
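<p>The standard <code>join</code> for the function monad behaves exactly this way, passing its argument twice:</p>

```haskell
import Control.Monad (join)

-- For the function monad, join f == \t -> f t t: the argument is
-- supplied twice, just as t is used twice to sample bb above.
doubled :: Int
doubled = join (+) 5
```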

<p><em>Note:</em> the FRP implementation described in <em><a href="http://conal.net/blog/posts/simply-efficient-functional-reactivity/" title="Blog post: &quot;Simply efficient functional reactivity&quot;">Simply efficient functional reactivity</a></em> <em>does not</em> include a <code>Monad</code> instance for <code>Behavior</code>, because I don&#8217;t see how to implement one with the hybrid data-/demand-driven <code>Behavior</code> implementation.
However, the closely related but less expressive type, <code>Reactive</code>, has the same semantic model as <code>Behavior</code>.  <code>Reactive</code> does have a Monad instance, and its semantic function (<code>rats</code>) <em>is</em> a monad morphism.</p>

<h4>Other examples</h4>

<p><a href="http://conal.net/blog/posts/simply-efficient-functional-reactivity/" title="Blog post: &quot;Simply efficient functional reactivity&quot;">The <em>Simply</em> paper</a> contains several more examples of type class morphisms:</p>

<ul>
<li><a href="http://conal.net/blog/posts/reactive-values-from-the-future/" title="Blog post: &quot;Reactive values from the future&quot;">Reactive values</a>, time functions, and <a href="http://conal.net/blog/posts/future-values/" title="Blog post: &quot;Future values&quot;">future values</a> are also morphisms on <code>Functor</code>, <code>Applicative</code>, and <code>Monad</code>.</li>
<li><em>Improving values</em> are morphisms on <code>Ord</code>.</li>
</ul>

<p>The paper also includes a significant <em>non-example</em>, namely events.
The semantics I gave for <code>Event a</code> is a time-ordered list of time/value pairs.  However, the semantic function (<code>occs</code>) <em>is not</em> a <code>Monoid</code> morphism, because</p>

<pre><code>occs (e `mappend` e') == occs e `merge` occs e'
</code></pre>

<p>and <code>merge</code> is not <code>(++)</code>, which is <code>mappend</code> on lists.</p>
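<p>A finite-list sketch of such a <code>merge</code> on time-ordered occurrence lists (hypothetical; the paper&#8217;s version must also handle infinite lists and not-yet-known times):</p>

```haskell
-- Interleave two time-ordered occurrence lists by time, left-biased
-- on ties. Note that the result differs from (++) in general.
merge :: Ord t => [(t, a)] -> [(t, a)] -> [(t, a)]
merge [] ys = ys
merge xs [] = xs
merge xs@((tx, x) : xs') ys@((ty, y) : ys')
  | tx <= ty  = (tx, x) : merge xs' ys
  | otherwise = (ty, y) : merge xs ys'
```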

<h4>Why care about type class morphisms?</h4>

<p>I want my library&#8217;s users to think of behaviors and future values as being their semantic models (functions of time and time/value pairs).
Why?
Because these denotational models are simple and precise and have simple and useful formal properties.
Those properties allow library users to program with confidence, and allow library providers to make radical changes in representation and implementation (even from demand-driven to data-driven) without breaking client programs.</p>

<p>When I think of a behavior as a function of time, I&#8217;d like it to act like a function of time, hence <code>Functor</code>, <code>Applicative</code>, and <code>Monad</code>.
And if it does implement any classes in common with functions, then it had better agree with the function instances of those classes.
Otherwise, user expectations will be mistaken, and the illusion is broken.</p>

<p>I&#8217;d love to hear about other examples of type class morphisms, particularly for <code>Applicative</code> and <code>Monad</code>, as well as thoughts on their usefulness.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/simplifying-semantics-with-type-class-morphisms/feed</wfw:commentRss>
		<slash:comments>18</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fsimplifying-semantics-with-type-class-morphisms&amp;language=en_GB&amp;category=text&amp;title=Simplifying+semantics+with+type+class+morphisms&amp;description=When+I+first+started+playing+with+functional+reactivity+in+Fran+and+its+predecessors%2C+I+didn%26%238217%3Bt+realize+that+much+of+the+functionality+of+events+and+reactive+behaviors+could+be+packaged+via...&amp;tags=applicative+functor%2CFRP%2Cfunctional+reactive+programming%2Cfunctor%2Cmonad%2Cmonoid%2Csemantics%2Ctype+class%2Ctype+class+morphism%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Simply efficient functional reactivity</title>
		<link>http://conal.net/blog/posts/simply-efficient-functional-reactivity</link>
		<comments>http://conal.net/blog/posts/simply-efficient-functional-reactivity#comments</comments>
		<pubDate>Fri, 04 Apr 2008 22:27:43 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[applicative functor]]></category>
		<category><![CDATA[continuous]]></category>
		<category><![CDATA[discrete]]></category>
		<category><![CDATA[events]]></category>
		<category><![CDATA[FRP]]></category>
		<category><![CDATA[functional reactive programming]]></category>
		<category><![CDATA[functor]]></category>
		<category><![CDATA[future value]]></category>
		<category><![CDATA[icfp]]></category>
		<category><![CDATA[implementation]]></category>
		<category><![CDATA[joinMaybes]]></category>
		<category><![CDATA[monad]]></category>
		<category><![CDATA[monoid]]></category>
		<category><![CDATA[multi-threading]]></category>
		<category><![CDATA[normal form]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[reactive behavior]]></category>
		<category><![CDATA[reactive value]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[time]]></category>
		<category><![CDATA[type class]]></category>
		<category><![CDATA[type class morphism]]></category>
		<category><![CDATA[type composition]]></category>

		<guid isPermaLink="false">http://conal.net/blog/posts/simply-efficient-functional-reactivity/</guid>
		<description><![CDATA[I submitted a paper Simply efficient functional reactivity to ICFP 2008. Abstract: Functional reactive programming (FRP) has simple and powerful semantics, but has resisted efficient implementation. In particular, most past implementations have used demand-driven sampling, which accommodates FRP&#8217;s continuous time semantics and fits well with the nature of functional programming. Consequently, values are wastefully recomputed [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Simply efficient functional reactivity

Tags: applicative functor, continuous, discrete, event, FRP, functional reactive programming, functor, future value, icfp, implementation, joinMaybes, monad, monoid, type class morphism, multi-threading, normal form, paper, reactive behavior, reactive value, semantics, time, type class, type composition

URL: http://conal.net/blog/posts/simply-efficient-functional-reactivity/

-->

<!-- references -->

<!-- teaser -->

<p>I submitted a paper <em><a href="http://conal.net/papers/simply-reactive" title="Paper: &quot;Simply efficient functional reactivity&quot;">Simply efficient functional reactivity</a></em> to <a href="http://www.icfpconference.org/icfp2008" title="ICFP 2008 conference page">ICFP 2008</a>.</p>

<p><strong>Abstract:</strong></p>

<blockquote>
  <p>Functional reactive programming (FRP) has simple and powerful semantics, but has resisted efficient implementation.  In particular, most past implementations have used demand-driven sampling, which accommodates FRP&#8217;s continuous time semantics and fits well with the nature of functional programming.  Consequently, values are wastefully recomputed even when inputs don&#8217;t change, and reaction latency can be as high as the sampling period.</p>
  
  <p>This paper presents a way to implement FRP that combines data- and demand-driven evaluation, in which values are recomputed only when necessary, and reactions are nearly instantaneous.  The implementation is rooted in a new simple formulation of FRP and its semantics and so is easy to understand and reason about.</p>
  
  <p>On the road to efficiency and simplicity, we&#8217;ll meet some old friends (monoids, functors, applicative functors, monads, morphisms, and improving values) and make some new friends (functional future values, reactive normal form, and concurrent &#8220;unambiguous choice&#8221;).</p>
</blockquote>

<!--
**Edits**:

* 2008-02-09: just fiddling around
-->

<!-- without a comment or something here, the last item above becomes a paragraph -->
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/simply-efficient-functional-reactivity/feed</wfw:commentRss>
		<slash:comments>33</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Fsimply-efficient-functional-reactivity&amp;language=en_GB&amp;category=text&amp;title=Simply+efficient+functional+reactivity&amp;description=I+submitted+a+paper+Simply+efficient+functional+reactivity+to+ICFP+2008.+Abstract%3A+Functional+reactive+programming+%28FRP%29+has+simple+and+powerful+semantics%2C+but+has+resisted+efficient+implementation.+In+particular%2C+most+past...&amp;tags=applicative+functor%2Ccontinuous%2Cdiscrete%2Cevents%2CFRP%2Cfunctional+reactive+programming%2Cfunctor%2Cfuture+value%2Cicfp%2Cimplementation%2CjoinMaybes%2Cmonad%2Cmonoid%2Cmulti-threading%2Cnormal+form%2Cpaper%2Creactive+behavior%2Creactive+value%2Csemantics%2Ctime%2Ctype+class%2Ctype+class+morphism%2Ctype+composition%2Cblog" type="text/html" />
	</item>
		<item>
		<title>Future values</title>
		<link>http://conal.net/blog/posts/future-values</link>
		<comments>http://conal.net/blog/posts/future-values#comments</comments>
		<pubDate>Wed, 16 Jan 2008 01:31:00 +0000</pubDate>
		<dc:creator><![CDATA[Conal]]></dc:creator>
				<category><![CDATA[Functional programming]]></category>
		<category><![CDATA[applicative functor]]></category>
		<category><![CDATA[future value]]></category>
		<category><![CDATA[monad]]></category>
		<category><![CDATA[monoid]]></category>
		<category><![CDATA[semantics]]></category>
		<category><![CDATA[type class]]></category>

		<guid isPermaLink="false">http://conal.net/blog/posts/future-values-part-one-semantics/</guid>
		<description><![CDATA[A future value (or simply &#8220;future&#8221;) is a value that might not be knowable until a later time, such as &#8220;the value of the next key you press&#8221;, or &#8220;the value of LambdaPix stock at noon next Monday&#8221; (both from the time you first read this sentence), or &#8220;how many tries it will take me [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- 

Title: Future values

Tags: applicative functors, future values, monads, monoids, semantics, reactivity

-->

<!-- References -->

<!-- teaser -->

<p>A <em>future value</em> (or simply &#8220;future&#8221;) is a value that might not be knowable until a later time, such as &#8220;the value of the next key you press&#8221;, or &#8220;the value of LambdaPix stock at noon next Monday&#8221; (both from the time you first read this sentence), or &#8220;how many tries it will take me to blow out all the candles on my next birthday cake&#8221;.  Unlike an imperative computation, each future has a unique value &#8212; although you probably cannot yet know what that value is.  I&#8217;ve implemented this notion of futures as part of a library <a href="http://haskell.org/haskellwiki/Reactive" title="Reactive">Reactive</a>.</p>

<p><strong>Edits</strong>:</p>

<ul>
<li>2008-04-04: tweaked tag; removed first section heading.</li>
</ul>

<!-- without a comment or something here, the last item above becomes a paragraph -->

<p><span id="more-7"></span></p>

<p>You can <em>force</em> a future, which makes you wait (block) until its value is knowable.  Meanwhile, what kinds of things can you do with a future <em>now</em>?</p>

<ul>
<li>Apply a function to the not-yet-known value, resulting in another future.  For instance, suppose <code>fc :: Future Char</code> is the first character you type after a specific time.  Then <code>fmap toUpper fc :: Future Char</code> is the capitalized version of the future character.  Thus, <code>Future</code> is a functor.  The resulting future is knowable when <code>fc</code> is knowable.</li>
<li>What about combining two or more future values?  For instance, how many days between the first time after the start of 2008 that the temperature exceeds 80 degrees Fahrenheit at (a) my home and (b) your home.  Each of those dates is a future value, and so is the difference between them.  If those futures are <code>m80, y80 :: Future Day</code>, then the difference is <code>diff80 = liftA2 (-) m80 y80</code>.  That difference becomes knowable when the <em>later</em> of <code>m80</code> and <code>y80</code> becomes knowable.  So <code>Future</code> is an applicative functor (AF), and one can apply a future function to a future argument to get a future result (<code>futRes = futFun &lt;*&gt; futArg</code>).  The other AF method is <code>pure :: a -&gt; Future a</code>, which makes a future value that is always knowable to be a given value.</li>
<li>Sometimes questions about the future are staged, such as &#8220;What will be the price of milk the day after the temperature next drops below freezing&#8221; (plus specifics about where and starting when).  Suppose <code>priceOn :: Day -&gt; Future Price</code> gives the price of milk on a given day (at some specified place), and <code>nextFreeze :: Day -&gt; Future Day</code> is the first date of a freeze (also at a specified place) after a given date.  Then our query is expressed as <code>nextFreeze today &gt;&gt;= priceOn</code>, which has type <code>Future Price</code>.  <code>Future</code> is thus a monad.  (The <code>return</code> method of a monad is the same as the <code>pure</code> method of an AF.)  From another perspective on monads, we can collapse a future future into a future, using <code>join :: Future (Future a) -&gt; Future a</code>.</li>
</ul>

<p>These three ways of manipulating futures are all focused on the value of futures.  There is one more, very useful, combining operation that focuses on the <em>timing</em> of futures: given two futures, which one comes first.  Although we can&#8217;t know the answer now, we can ask the question now and get a future.  For example, what is the next character that either you or I will type?  Call those characters <code>mc, yc :: Future Char</code>.  The earlier of the two is <code>mc `mappend` yc</code>, which has type <code>Future Char</code>.  Thus, <code>Future ty</code> is a monoid for every type <code>ty</code>.  The other monoid method is <code>mempty</code> (the identity for <code>mappend</code>), which is the future that never happens.</p>

<h3>Why aren&#8217;t futures just lazy values?</h3>

<p>If futures were just lazy values, then we wouldn&#8217;t have to use <code>pure</code>, <code>fmap</code>, <code>(&lt;*&gt;)</code> (and <code>liftA</code><em>n</em>), and <code>(&gt;&gt;=)</code>.  However, there isn&#8217;t enough semantic content in a plain old value to determine which of two values is <em>earlier</em> (<code>mappend</code> on futures).</p>

<h2>A semantics for futures</h2>

<p>To clarify my thinking about future values, I&#8217;d like to have a simple and precise denotational semantics and then an implementation that is faithful to the semantics.  The module <code>Data.SFuture</code> provides such a semantics, although the implementation in <code>Data.Future</code> is not completely faithful.</p>

<h3>The model</h3>

<p>The semantic model is very simple: (the meaning of) a future value is just a time/value pair.  The particular choice of &#8220;time&#8221; type is not important, as long as it is ordered.</p>

<pre><code>newtype Future t a = Future (Time t, a)
  deriving (Functor, Applicative, Monad, Show)
</code></pre>

<p>Delightfully, almost all required functionality comes automatically from the derived class instances, thanks to the standard instances for pairs and the definition of <code>Time</code>, given below.  Rather than require our time type to be bounded, we can easily add bounds to an arbitrary type.  Rather than defining <code>Time t</code> now, let&#8217;s discover the definition while considering the required meanings of the class instances.  The definition will use just a bit of wrapping around the type <code>t</code>, demonstrating a principle Conor McBride <a href="http://article.gmane.org/gmane.comp.lang.haskell.cafe/26520">expressed</a> as &#8220;types don&#8217;t just contain data, types explain data&#8221;.</p>

<h3>Functor</h3>

<p>The <code>Functor</code> instance is provided entirely by the standard instance for pairs:</p>

<pre><code>instance Functor ((,) a) where fmap f (a,b) = (a, f b)
</code></pre>

<p>In particular, <code>fmap f (Future (t,b)) == Future (t, f b)</code>, as desired.</p>

<h3>Applicative and Time</h3>

<p>Look next at the <code>Applicative</code> instance for pairs:</p>

<pre><code>instance Monoid a =&gt; Applicative ((,) a) where
  pure x = (mempty, x)
  (u, f) &lt;*&gt; (v, x) = (u `mappend` v, f x)
</code></pre>

<p>So <code>Time t</code> must be a monoid, with <code>mempty</code> being the earliest time and <code>mappend</code> being <code>max</code>.  We&#8217;ll define <code>Time</code> with the help of the <code>Max</code> monoid:</p>

<pre><code>newtype Max a = Max { getMax :: a }
  deriving (Eq, Ord, Read, Show, Bounded)

instance (Ord a, Bounded a) =&gt; Monoid (Max a) where
  mempty = Max minBound
  Max a `mappend` Max b = Max (a `max` b)
</code></pre>
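<p>Here is a runnable sketch of the <code>Max</code> monoid above.  One assumption of mine: on modern GHC (8.4+), <code>Monoid</code> requires a <code>Semigroup</code> superclass instance (the post predates that change), so <code>mappend</code>&#8217;s work moves into <code>(&lt;&gt;)</code>:</p>

```haskell
-- The Max monoid from the post, adapted for modern GHC:
-- mappend's definition lives in the Semigroup instance as (<>).
newtype Max a = Max { getMax :: a }
  deriving (Eq, Ord, Read, Show, Bounded)

instance Ord a => Semigroup (Max a) where
  Max a <> Max b = Max (a `max` b)

instance (Ord a, Bounded a) => Monoid (Max a) where
  mempty = Max minBound

main :: IO ()
main = do
  print (getMax (Max (3 :: Int) <> Max 7))   -- 7
  print (getMax (mempty <> Max (5 :: Int)))  -- 5
```

<p>Note that <code>(&lt;&gt;)</code> needs only <code>Ord</code>; <code>Bounded</code> is needed just for <code>mempty</code>.</p>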

<p>We could require that the underlying time parameter type <code>t</code> be <code>Bounded</code>, but I want to have as few restrictions as possible.  For instance, <code>Integer</code>, <code>Float</code>, and <code>Double</code> are not <code>Bounded</code>, and neither are the types in the <code>Time</code> library.  Fortunately, it&#8217;s easy to add bounds to any type, preserving the existing ordering.</p>

<pre><code>data AddBounds a = MinBound | NoBound a | MaxBound
  deriving (Eq, Ord, Read, Show)

instance Bounded (AddBounds a) where
  minBound = MinBound
  maxBound = MaxBound
</code></pre>
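<p>The derived <code>Ord</code> compares constructors in declaration order, so <code>MinBound</code> sits below every <code>NoBound</code> value and <code>MaxBound</code> above.  A few checks of mine:</p>

```haskell
-- AddBounds as in the post; derived Ord orders by constructor,
-- and within NoBound by the underlying value.
data AddBounds a = MinBound | NoBound a | MaxBound
  deriving (Eq, Ord, Read, Show)

main :: IO ()
main = do
  print (MinBound < NoBound (0 :: Integer))     -- True
  print (NoBound (3 :: Integer) < NoBound 5)    -- True
  print (NoBound (maxBound :: Int) < MaxBound)  -- True
```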

<p>With these two reusable building blocks, our <code>Time</code> definition falls right out:</p>

<pre><code>type Time t = Max (AddBounds t)
</code></pre>

<h3>Monad</h3>

<p>For our <code>Monad</code> instance, we just need an instance for pairs, equivalent to the writer monad:</p>

<pre><code>instance Monoid o =&gt; Monad ((,) o) where
  return = pure
  (o,a) &gt;&gt;= f = (o `mappend` o', a') where (o',a') = f a
</code></pre>

<p>Consequently (using <code>join m = m &gt;&gt;= id</code>), <code>join (o, (o',a)) == (o `mappend` o', a)</code>.  Again, the standard instance implies exactly the desired meaning for futures: <code>Future (t,a) &gt;&gt;= f</code> is available exactly at the later of <code>t</code> and the availability of <code>f a</code>.  We might have guessed instead that the time is simply the time of <code>f a</code>, on the assumption that it is always at least <code>t</code>.  However, <code>f a</code> could result from <code>pure</code> and so have time <code>minBound</code>.</p>
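<p>To see that timing behavior concretely, here is a small sketch using the <code>((,) w)</code> instances that now ship in <code>base</code>, with <code>Data.Semigroup.Max</code> standing in for <code>Time</code>:</p>

```haskell
import Data.Semigroup (Max (..))

-- pure places a value at the earliest possible time (minBound) ...
atMinTime :: (Max Int, String)
atMinTime = pure "hello"

-- ... so (>>=) must take the later of the two times, via mappend = max.
chained :: (Max Int, String)
chained = (Max 3, "x") >>= \s -> (Max 7, s ++ "y")

main :: IO ()
main = do
  print (getMax (fst atMinTime) == minBound)  -- True
  print chained                               -- (Max {getMax = 7},"xy")
```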

<h3>Monoid</h3>

<p>The last piece of <code>Future</code> functionality is the <code>Monoid</code> instance, and I don&#8217;t know how to get that instance to define itself.  I want <code>mappend</code> to yield the <em>earlier</em> of two futures, choosing the first argument when simultaneous.  The never-occurring <code>mempty</code> has a time beyond all <code>t</code> values.</p>

<pre><code>instance Ord t =&gt; Monoid (Future t a) where
  mempty  = Future (maxBound, error "it'll never happen, buddy")
  fut@(Future (t,_)) `mappend` fut'@(Future (t',_)) =
    if t &lt;= t' then fut else fut'
</code></pre>
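<p>Putting the pieces together, here is a hypothetical, self-contained check that <code>mappend</code> really picks the earlier future.  Assumptions of mine: <code>Semigroup</code> instances are added for modern GHC, and <code>future</code> and <code>valueOf</code> are illustrative helpers, not from the post:</p>

```haskell
-- The post's pieces, minimally assembled (Semigroup added for modern GHC).
data AddBounds a = MinBound | NoBound a | MaxBound
  deriving (Eq, Ord, Show)

newtype Max a = Max { getMax :: a }
  deriving (Eq, Ord, Show)

instance Ord a => Semigroup (Max a) where
  Max a <> Max b = Max (a `max` b)

type Time t = Max (AddBounds t)

newtype Future t a = Future (Time t, a)

-- mappend yields the earlier future, first argument on a tie.
instance Ord t => Semigroup (Future t a) where
  fut@(Future (t, _)) <> fut'@(Future (t', _)) =
    if t <= t' then fut else fut'

instance Ord t => Monoid (Future t a) where
  mempty = Future (Max MaxBound, error "it'll never happen, buddy")

-- Hypothetical helper: a future known at an ordinary time t.
future :: t -> a -> Future t a
future t a = Future (Max (NoBound t), a)

valueOf :: Future t a -> a
valueOf (Future (_, a)) = a

main :: IO ()
main = do
  print (valueOf (future (3 :: Int) "early" <> future 7 "late"))  -- "early"
  print (valueOf (future (5 :: Int) "x" <> mempty))               -- "x"
```

<p>The second line also shows why <code>mempty</code>&#8217;s bottom value is harmless: the earlier future wins before the error is ever forced.</p>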

<h2>Coming next</h2>

<p>Tune in for the <a href="http://conal.net/blog/posts/future-values-via-multi-threading/" title="Blog post: &quot;Future values via multi-threading&quot;">next post</a>, which describes the current implementation of future values in <a href="http://haskell.org/haskellwiki/Reactive" title="Reactive">Reactive</a>.  The implementation uses multi-threading and is not quite faithful to the semantics given here.  I&#8217;m looking for a faithful implementation.</p>

<p>A following post will then describe the use of future values in an elegant new implementation of functional reactive programming.</p>
]]></content:encoded>
			<wfw:commentRss>http://conal.net/blog/posts/future-values/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<atom:link rel="payment" title="Flattr this!" href="https://flattr.com/submit/auto?user_id=conal&amp;popout=1&amp;url=http%3A%2F%2Fconal.net%2Fblog%2Fposts%2Ffuture-values&amp;language=en_GB&amp;category=text&amp;title=Future+values&amp;description=A+future+value+%28or+simply+%26%238220%3Bfuture%26%238221%3B%29+is+a+value+that+might+not+be+knowable+until+a+later+time%2C+such+as+%26%238220%3Bthe+value+of+the+next+key+you+press%26%238221%3B%2C+or+%26%238220%3Bthe...&amp;tags=applicative+functor%2Cfuture+value%2Cmonad%2Cmonoid%2Csemantics%2Ctype+class%2Cblog" type="text/html" />
	</item>
	</channel>
</rss>
