## Comparing formulations of higher-dimensional, higher-order derivatives

I just reread Jason Foutz’s post Higher order multivariate automatic differentiation in Haskell, as I’m thinking about this topic again. I like his trick of using an `IntMap`

to hold the partial derivatives and (recursively) the partials of those partials, etc.

Some thoughts:

I bet one can eliminate the constant (

`C`

) case in Jason’s representation, and hence 3/4 of the cases to handle, without much loss in performance. He already has a fairly efficient representation of constants, which is a`D`

with an empty`IntMap`

.I imagine there’s also a nice generalization of the code for combining two finite maps used in his third multiply case. The code’s meaning and correctness follows from a model for those maps as total functions with missing elements denoting a default value (zero in this case).

Jason’s data type reminds me of a sparse matrix representation, but cooler in how it’s infinitely nested. Perhaps depth

*n*(starting with zero) is a sparse*n-dimensional*matrix.Finally, I suspect there’s a close connection between Jason’s

`IntMap`

-based implementation and my`LinearMap`

-based implementation described in*Higher-dimensional, higher-order derivatives, functionally*and in*Simpler, more efficient, functional linear maps*. For the case of*R*, my formulation uses a trie with entries for^{n}*n*basis elements, while Jason’s uses an`IntMap`

(which is also a trie) with*n*entries (counting any implicit zeros).

I suspect Jason’s formulation is more efficient (since it optimizes the constant case), while mine is more statically typed and more flexible (since it handles more than *R ^{n}*).

For optimizing constants, I think I’d prefer having a single constructor with a `Maybe`

for the derivatives, to eliminate code duplication.

I am still trying to understand the paper *Lazy Multivariate Higher-Order Forward-Mode AD*, with its management of various epsilons.

A final remark: I prefer the term “higher-dimensional” over the traditional “multivariate”. I hear classic syntax/semantics confusion in the latter.