Higher-dimensional, higher-order derivatives, functionally

The post Beautiful differentiation showed some lovely code that makes it easy to compute not just the values of user-written functions, but also all of its derivatives (infinitely many). This elegant technique is limited, however, to functions over a scalar (one-dimensional) domain. Next, we explored what it means to transcend that limitation, asking and answering the question What is a derivative, really? The answer to that question is that derivative values are linear maps saying how small input changes result in output changes. This answer allows us to unify several different notions of derivatives and their corresponding chain rules into a single simple and powerful form.

This third post combines the ideas from the two previous posts, to easily compute infinitely many derivatives of functions over arbitrary-dimensional domains.

The code shown here is part of a new Haskell library, which you can download and play with or peruse on the web.

Continue reading ‘Higher-dimensional, higher-order derivatives, functionally’ »

What is a derivative, really?

The post Beautiful differentiation showed how easily and beautifully one can construct an infinite tower of derivative values in Haskell programs, while computing plain old values. The trick (from Jerzy Karczmarczuk) was to overload numeric operators to operate on the following (co)recursive type:

data Dif b = D b (Dif b)

This representation, however, works only when differentiating functions from a scalar (one-dimensional) domain, i.e., functions of type a -> b for a scalar type a. The reason for this limitation is that only in those cases can the type of derivative values be identified with the type of regular values.

Consider a function f :: (R,R) -> R, where R is, say, Double. The value of f at a domain value (x,y) has type R, but the derivative of f consists of two partial derivatives. Moreover, the second derivative consists of four partial second-order derivatives (or three, depending how you count). A function f :: (R,R) -> (R,R,R) also has two partial derivatives at each point (x,y), each of which is a triple. That pair of triples is commonly written as a two-by-three matrix.

Each of these situations has its own derivative shape and its own chain rule (for the derivative of function compositions), using plain-old multiplication, scalar-times-vector, vector-dot-vector, matrix-times-vector, or matrix-times-matrix. Second derivatives are more complex and varied.

How many forms of derivatives and chain rules are enough? Are we doomed to work with a plethora of increasingly complex types of derivatives, as well as the diverse chain rules needed to accommodate all compatible pairs of derivatives? Fortunately, not. There is a single, simple, unifying generalization. By reconsidering what we mean by a derivative value, we can see that these various forms are all representations of a single notion, and all the chain rules mean the same thing on the meanings of the representations.

This blog post is about that unifying view of derivatives.

Edits:

  • 2008-05-20: There are several comments about this post on reddit.
  • 2008-05-20: Renamed derivative operator from D to deriv to avoid confusion with the data constructor for derivative towers.
  • 2008-05-20: Renamed linear map type from (:->) to (:-*) to make it visually closer to a standard notation.

Continue reading ‘What is a derivative, really?’ »

Beautiful differentiation

Lately I’ve been playing again with parametric surfaces in Haskell. Surface rendering requires normals, which can be constructed from partial derivatives, which brings up automatic differentiation (AD). Playing with some refactoring, I’ve stumbled across a terser, lovelier formulation for the derivative rules than I’ve seen before.

Edits:

  • 2008-05-08: Added source files: NumInstances.hs and Dif.hs.
  • 2008-05-20: Changed some variable names for clarity and consistency. For instance, x@(D x0 x') instead of p@(D x x').
  • 2008-05-20: Removed extraneous Fractional constraint in the Floating instance of Dif.

Continue reading ‘Beautiful differentiation’ »