29th July 2010, 09:06 am

The zipper is an efficient and elegant data structure for purely functional editing of tree-like data structures, first published by Gérard Huet.
Zippers maintain a location of focus in a tree and support navigation operations (up, down, left, right) and editing (replace current focus).

The original zipper type and operations are customized for a single type, but it’s not hard to see how to adapt to other tree-like types, and hence to regular data types.
There have been many follow-up papers to *The Zipper*, including a polytypic version in the paper *Type-indexed data types*.

All of the zipper adaptations and generalizations I’ve seen so far maintain the original navigation interface.
In this post, I propose an alternative interface that appears to significantly simplify matters.
There are only two navigation functions instead of four, and each of the two is specified and implemented via a fairly simple one-liner.

I haven’t used this new zipper formulation in an application yet, so I do not know whether some usefulness has been lost in simplifying the interface.

The code in this blog post is taken from the Haskell library functor-combo and completes the `Holey`

type class introduced in *Differentiation of higher-order types*.

**Edits**:

- 2010-07-29: Removed some stray
`Just`

applications in `up`

definitions. (Thanks, illissius.)
- 2010-07-29: Augmented my complicated definition of
`tweak2`

with a much simpler version from Sjoerd Visscher.
- 2010-07-29: Replaced
`fmap (first (:ds'))`

with `(fmap.first) (:ds')`

in `down`

definitions. (Thanks, Sjoerd.)

Continue reading ‘Another angle on zippers’ »

: http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf "paper by Gérard Huet"
: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.6342 "paper by Ralf Hinze, Johan Jeuring, and Andres Löh"
: http://hackage.haskell.org/package/functor-combo "Hackage entry: functor-combo"
: http://conal.net/blog/posts/differentiation-of-higher-order-types/ "blog post"
: http://conal.net/blog/posts/semantic-editor-combinators/ "blog post"
: http://conal.net/blog/posts/prettier-functions-for-wrapping-and-wrapping/ "blog post"
: http://matt.immute.net/content/pointless-fun "blog post by Matt Hellige"
: http://hackage.haskell.org/packages/archive/TypeCompose/latest/doc/html/Control-Compose.html "module documentation"
[Elegant memoization with ...

28th July 2010, 06:45 pm

A “one-hole context” is a data structure with one piece missing.
Conor McBride pointed out that the derivative of a regular type is its type of one-hole contexts.
When a data structure is assembled out of common functor combinators, a corresponding type of one-hole contexts can be derived mechanically by rules that mirror the standard derivative rules learned in beginning differential calculus.

I’ve been playing with functor combinators lately.
I was delighted to find that the data-structure derivatives can be expressed directly using the standard functor combinators and type families.

The code in this blog post is taken from the Haskell library functor-combo.

See also the Haskell Wikibooks page on zippers, especially the section called “Differentiation of data types”.

I mean this post not as new research, but rather as a tidy, concrete presentation of some of Conor’s delightful insight.

Continue reading ‘Differentiation of higher-order types’ »

: http://www.cs.nott.ac.uk/~ctm/diff.pdf "paper by Conor McBride"
: http://www.comlab.ox.ac.uk/people/ralf.hinze/talks/Ameland.pdf "Slides by Ralf Hinze"
: http://conal.net/blog/posts/elegant-memoization-with-higher-order-types/ "blog post"
: http://conal.net/blog/posts/memoizing-higher-order-functions/ "blog post"
: http://hackage.haskell.org/package/functor-combo "Hackage entry: functor-combo"
: http://www.haskell.org/ghc/docs/latest/html/users_guide/type-families.html "GHC documentation on type families"
: http://conal.net/blog/posts/paper-beautiful-differentiation/ "blog ...

4th January 2010, 01:55 pm

Ever since ActiveVRML, the model we’ve been using in functional reactive programming (FRP) for interactive behaviors is `(T->a) -> (T->b)`

, for dynamic (time-varying) input of type `a`

and dynamic output of type `b`

(where `T`

is time).
In “Classic FRP” formulations (including ActiveVRML, Fran & Reactive), there is a “behavior” abstraction whose denotation is a function of time.
Interactive behaviors are then modeled as host language (e.g., Haskell) functions between behaviors.
Problems with this formulation are described in *Why classic FRP does not fit interactive behavior*.
These same problems motivated “Arrowized FRP”.
In Arrowized FRP, behaviors (renamed “signals”) are purely conceptual.
They are part of the semantic model but do not have any realization in the programming interface.
Instead, the abstraction is a *signal transformer*, `SF a b`

, whose semantics is `(T->a) -> (T->b)`

.
See *Genuinely Functional User Interfaces* and *Functional Reactive Programming, Continued*.

Whether in its classic or arrowized embodiment, I’ve been growing uncomfortable with this semantic model of functions between time functions.
A few weeks ago, I realized that one source of discomfort is that this model is *mostly junk*.

This post contains some partially formed thoughts about how to eliminate the junk (“garbage collect the semantics”), and what might remain.

Continue reading ‘Garbage collecting the semantics of FRP’ »

: http://conal.net/papers/ActiveVRML/ "Tech report: "A Brief Introduction to ActiveVRML""
: http://conal.net/papers/icfp97/ "paper"
: http://conal.net/papers/push-pull-frp/ "Paper by Conal Elliott and Paul Hudak"
: http://conal.net/papers/genuinely-functional-guis.pdf "Paper by Antony Courtney and Conal Elliott"
: http://www.haskell.org/yale/papers/haskellworkshop02/ "Paper by Henrik Nilsson, Antony Courtney, and John Peterson"
[Why classic FRP does not fit interactive ...

24th February 2009, 12:05 am

I have another paper draft for submission to ICFP 2009.
This one is called *Beautiful differentiation*,
The paper is a culmination of the several posts I’ve written on derivatives and automatic differentiation (AD).
I’m happy with how the derivation keeps getting simpler.
Now I’ve boiled extremely general higher-order AD down to a `Functor`

and `Applicative`

morphism.

I’d love to get some readings and feedback.
I’m a bit over the page the limit, so I’ll have to do some trimming before submitting.

The abstract:

Automatic differentiation (AD) is a precise, efficient, and convenient
method for computing derivatives of functions. Its implementation can be
quite simple even when extended to compute all of the higher-order
derivatives as well. The higher-dimensional case has also been tackled,
though with extra complexity. This paper develops an implementation of
higher-dimensional, higher-order differentiation in the extremely
general and elegant setting of *calculus on manifolds* and derives that
implementation from a simple and precise specification.

In order to motivate and discover the implementation, the paper poses
the question “What does AD mean, independently of implementation?” An
answer arises in the form of *naturality* of sampling a function and its
derivative. Automatic differentiation flows out of this naturality
condition, together with the chain rule. Graduating from first-order to
higher-order AD corresponds to sampling all derivatives instead of just
one. Next, the notion of a derivative is generalized via the notions of
vector space and linear maps. The specification of AD adapts to this
elegant and very general setting, which even *simplifies* the
development.

You can get the paper and see current errata here.

The submission deadline is March 2, so comments before then are most helpful to me.

Enjoy, and thanks!

: http://conal.net/papers/beautiful-differentiation "paper"
: http://www.cs.nott.ac.uk/~gmh/icfp09.html "conference page"
: http://www.cs.pdx.edu/~apt/icfp09_cfp.html "ICFP 2009 call for papers"
I have another paper draft for submission to .
This one is called **,
The paper is a culmination of the (http://conal.net/blog/tag/derivative/) I've written on derivatives and automatic differentiation (AD).
I'm happy with how the derivation keeps getting ...

28th January 2009, 12:09 pm

Bertrand Russell remarked that

*Everything is vague to a degree you do not realize till you have tried to make it precise.*

I’m mulling over automatic differentiation (AD) again, neatening up previous posts on derivatives and on linear maps, working them into a coherent whole for an ICFP submission. I understand the mechanics and some of the reasons for its correctness. After all, it’s "just the chain rule".

As usual, in the process of writing, I bumped up against Russell’s principle. I felt a growing uneasiness and realized that I didn’t understand AD in the way I like to understand software, namely,

*What* does it mean, independently of implementation?*How* do the implementation and its correctness flow gracefully from that meaning?*Where* else might we go, guided by answers to the first two questions?

Ever since writing *Simply efficient functional reactivity*, the idea of type class morphisms keeps popping up for me as a framework in which to ask and answer these questions. To my delight, this framework gives me new and more satisfying insight into automatic differentiation.

Continue reading ‘What is automatic differentiation, and why does it work?’ »

Bertrand Russell remarked thatEverything is vague to a degree you do not realize till you have tried to make it precise.I'm mulling over automatic differentiation (AD) again, neatening up previous posts on derivatives and on linear maps, working them into a coherent whole for an ICFP submission. I understand the mechanics and some of ...

24th January 2009, 05:40 pm

I just reread Jason Foutz’s post Higher order multivariate automatic differentiation in Haskell, as I’m thinking about this topic again. I like his trick of using an `IntMap`

to hold the partial derivatives and (recursively) the partials of those partials, etc.

Some thoughts:

I bet one can eliminate the constant (`C`

) case in Jason’s representation, and hence 3/4 of the cases to handle, without much loss in performance. He already has a fairly efficient representation of constants, which is a `D`

with an empty `IntMap`

.

I imagine there’s also a nice generalization of the code for combining two finite maps used in his third multiply case. The code’s meaning and correctness follows from a model for those maps as total functions with missing elements denoting a default value (zero in this case).

Jason’s data type reminds me of a sparse matrix representation, but cooler in how it’s infinitely nested. Perhaps depth *n* (starting with zero) is a sparse *n-dimensional* matrix.

Finally, I suspect there’s a close connection between Jason’s `IntMap`

-based implementation and my `LinearMap`

-based implementation described in *Higher-dimensional, higher-order derivatives, functionally* and in *Simpler, more efficient, functional linear maps*. For the case of *R*^{n}, my formulation uses a trie with entries for *n* basis elements, while Jason’s uses an `IntMap`

(which is also a trie) with *n* entries (counting any implicit zeros).

I suspect Jason’s formulation is more efficient (since it optimizes the constant case), while mine is more statically typed and more flexible (since it handles more than *R*^{n}).

For optimizing constants, I think I’d prefer having a single constructor with a `Maybe`

for the derivatives, to eliminate code duplication.

I am still trying to understand the paper *Lazy Multivariate Higher-Order Forward-Mode AD*, with its management of various epsilons.

A final remark: I prefer the term “higher-dimensional” over the traditional “multivariate”.
I hear classic syntax/semantics confusion in the latter.

: http://conal.net/blog/posts/higher-dimensional-higher-order-derivatives-functionally/ "blog post"
: http://conal.net/blog/posts/simpler-more-efficient-functional-linear-maps/ "blog post"
: http://www.bcl.hamilton.ie/~qobi/tower/papers/popl2007a.pdf "paper by Barak Pearlmutter and Jeffrey Siskind"
I just reread Jason Foutz's post (http://metavar.blogspot.com/2008/02/higher-order-multivariate-automatic.html), as I'm thinking about this topic again. I like his trick of using an ...

3rd June 2008, 09:49 pm

Two earlier posts described a simple and general notion of *derivative* that unifies the many concrete notions taught in traditional calculus courses.
All of those variations turn out to be concrete representations of the single abstract notion of a *linear map*.
Correspondingly, the various forms of mulitplication in chain rules all turn out to be implementations of *composition* of linear maps.
For simplicity, I suggested a direct implementation of linear maps as functions.
Unfortunately, that direct representation thwarts efficiency, since functions, unlike data structures, do not cache by default.

This post presents a *data* representation of linear maps that makes crucial use of (a) linearity and (b) the recently added language feature *indexed type families* (“associated types”).

For a while now, I’ve wondered if a library for linear maps could replace and generalize matrix libraries.
After all, matrices represent of a restricted class of linear maps.
Unlike conventional matrix libraries, however, the linear map library described in this post captures matrix/linear-map dimensions via *static typing*.
The composition function defined below statically enforces the conformability property required of matrix multiplication (which implements linear map composition).
Likewise, conformance for addition of linear maps is also enforced simply and statically.
Moreover, with sufficiently sophisticated coaxing of the Haskell compiler, of the sort Don Stewart does, perhaps a library like this one could also have terrific performance. (It doesn’t yet.)

You can read and try out the code for this post in the module Data.LinearMap in version 0.2.0 or later of the vector-space package.
That module also contains an implementation of linear map composition, as well as `Functor`

-like and `Applicative`

-like operations.
Andy Gill has been helping me get to the bottom of some some severe performance problems, apparently involving huge amounts of redundant dictionary creation.

**Edits**:

- 2008-06-04: Brief explanation of the associated data type declaration.

Continue reading ‘Functional linear maps’ »

: http://conal.net/blog/posts/what-is-a-derivative-really/ "Blog post: "What is a derivative, really?""
: http://conal.net/blog/posts/higher-dimensional-higher-order-derivatives-functionally/ "Blog post: "Higher-dimensional, higher-order derivatives, functionally""
: http://haskell.org/haskellwiki/vector-space "Library wiki page: "vector-space""
: http://reddit.com/goto?id=6lx36 "Blog post: "Haskell as fast as C: working at a high ...