Functional linear maps

Two earlier posts described a simple and general notion of derivative that unifies the many concrete notions taught in traditional calculus courses. All of those variations turn out to be concrete representations of the single abstract notion of a linear map. Correspondingly, the various forms of mulitplication in chain rules all turn out to be implementations of composition of linear maps. For simplicity, I suggested a direct implementation of linear maps as functions. Unfortunately, that direct representation thwarts efficiency, since functions, unlike data structures, do not cache by default.

This post presents a data representation of linear maps that makes crucial use of (a) linearity and (b) the recently added language feature indexed type families (“associated types”).

For a while now, I’ve wondered if a library for linear maps could replace and generalize matrix libraries. After all, matrices represent of a restricted class of linear maps. Unlike conventional matrix libraries, however, the linear map library described in this post captures matrix/linear-map dimensions via static typing. The composition function defined below statically enforces the conformability property required of matrix multiplication (which implements linear map composition). Likewise, conformance for addition of linear maps is also enforced simply and statically. Moreover, with sufficiently sophisticated coaxing of the Haskell compiler, of the sort Don Stewart does, perhaps a library like this one could also have terrific performance. (It doesn’t yet.)

You can read and try out the code for this post in the module Data.LinearMap in version 0.2.0 or later of the vector-space package. That module also contains an implementation of linear map composition, as well as Functor-like and Applicative-like operations. Andy Gill has been helping me get to the bottom of some some severe performance problems, apparently involving huge amounts of redundant dictionary creation.

Edits:

  • 2008-06-04: Brief explanation of the associated data type declaration.

Linear maps

Semantically, a linear map is a function f :: a -> b such that, for all scalar values c and “vectors” u, v :: a, the following properties hold:

f (c *^ u)  == c *^ f u
f (u ^+^ v) == f u ^+^ f v

where (*^) and (^+^) are scalar multiplication and vector addition. (See VectorSpace details in a previous post.)

Although the semantics of a linear map will be a function, the representation will be a data structure.

data a :-* b = ...

The semantic function is

lapply :: (LMapDom a s, VectorSpace b s) =>
          (a :-* b) -> (a -> b)  -- result will be linear

The first constraint says that we know how to represent linear maps whose domain is the vector space a, which has the associated scalar field s. The second constraint say that b must be a vector space over that same scalar field.

Conversely, there is also a function to turn any linear function into a linear map:

linear :: LMapDom a s => (a -> b) -> (a :-* b)  -- argument must be linear

These two functions and the linear map data type are packaged up as the LMapDom type class:

-- | Domain of a linear map.
class VectorSpace a s => LMapDom a s | a -> s where
  -- | Linear map type
  data (:-*) a :: * -> *
  -- | Linear map as function
  lapply :: VectorSpace b s => (a :-* b) -> (a -> b)
  -- | Function (assumed linear) as linear map.
  linear :: (a -> b) -> (a :-* b)

The data definition means that the data type (a :-* b) (of linear maps from a to b) is has a variety of representations, each one associated with a type a.

These two conversion functions are required to be inverses:

{-# RULES
 
"linear.lapply"   forall m. linear (lapply m) = m
 
"lapply.linear"   forall f. lapply (linear f) = f
 
 #-}

Scalar domains

Consider a linear function f over a scalar domain. Then

f s == f (s *^ 1)
    == s *^ f 1  -- by linearity

Therefore, f is fully determined by its value at 1, and so an adequate representation of f is then simply the value f 1.

This observation leads to LMapDom instances like the following:

instance LMapDom Double Double where
  data Double :-* o  = DoubleL o
  lapply (DoubleL o) = (*^ o)
  linear f           = DoubleL (f 1)

Non-scalar domains

Maps over non-scalar domains are a little trickier. Consider a linear function f over a domain of pairs of scalar values. Then

f (a,b) == f (a *^ (1,0) ^+^ b *^ (0,1))
        == f (a *^ (1,0)) ^+^ f (b *^ (0,1))  -- linearity
        == a *^ f (1,0) ^+^ b *^ f (0,1)      -- linearity twice more

So f is determined by f (1,0) and f (0,1) and thus can be represented by those two values.

instance LMapDom (Double,Double) Double where
  data (Double,Double) :-* o = PairD o o
  PairD ao bo `lapply` (a,b) = a *^ ao ^+^ b *^ bo
  linear f = PairD (f (0,1)) (f (1,0))

and similarly for triples, etc.

This definition works fine, but I want something compositional. I’d like linear maps over pairs of pairs and so on.

Composing domains

We can still use part of our linearity property. Using zeroV as the zero vector for arbitrary vector spaces,

f (a,b) == f ((a,zeroV) ^+^ (zeroV,b))
        == f (a,zeroV) ^+^ f (zeroV,b)

We see that f is determined by its behavior when either argument is zero.

In other words, f can be reconstructed from two other functions over simpler domains:

fa a = f (a,zeroV)
fb b = f (zeroV,b)

If f :: (a,b) -> o, then fa :: a -> o and fb :: b -> o.

Exercise: show that fa and fb are linear if f is. We can thus reduce the problem of representing the linear function f to the problems of representing fa and fb. This insight is captured in the following LMapDom instance:

instance (LMapDom a s, LMapDom b s) => LMapDom (a,b) s where
  data (a,b) :-* o = PairL (a :-* o) (b :-* o)
  PairL ao bo `lapply` (a,b) = ao `lapply` a ^+^ bo `lapply` b
  linear f = PairL (linear (\ a -> f (a,zeroV)))
                   (linear (\ b -> f (zeroV,b)))

Of course, there are similar instances for triples, etc, as well as for tuple variants with strict fields, such as OpenGL’s Vector2 and Vector3 types.

What have we done?

If you’ve studied linear algebra, you may be thinking now about the idea of a basis of a vector space. A basis is a minimal set of vectors that can be combined linearly to cover the entire vector space. Any linear map is determined by its behavior on any basis. For scalars, the set {1} is a basis, while for pairs of scalars, {(1,0), (0,1)} is a basis. It is not just coincidental that exactly these basis vectors showed up in the definitions of linear for Double and (Double,Double).

In the general pairing instance of LMapDom above, bases are built up recursively. Each recursive call to linear results in a data structure that holds the values of fa over a basis for a and the values of fb over a basis for b. Each of those basis vectors corresponds to a basis vector for (a,b), by zeroV-padding.

The dimension of a vector space is the number of elements in a basis (which is independent of the particular choice of basis). For vector space types a and b,

dimension (a,b) == dimension a + dimension b

which corresponds to the fact that our linear map representation (as built by linear) contains samples for each basis element of a, plus samples for each basis element of b (all zeroV-padded).

Working with linear maps

Besides the instances above for creating and applying linear maps, what else can we do? For starters, let’s define the identity linear map. Since the identity function is already linear, simply convert it to a linear map:

idL :: LMapDom a s => a :-* a
idL = linear id

Another very useful tool is transforming a linear map by transforming a linear function.

inL :: (LMapDom c s, VectorSpace b s', LMapDom a s') =>
        ((a -> b) -> (c -> d)) -> ((a :-* b) -> (c :-* d))
inL h = linear . h . lapply

where the higher-order function h is assumed to map linear functions to linear functions.

We don’t have to stop at unary transformations of linear functions.

-- | Transform a linear maps by transforming linear functions.
inL2 :: ( LMapDom c s, VectorSpace b s', LMapDom a s'
        , LMapDom e s, VectorSpace d s ) =>
        ((a  -> b) -> (c  -> d) -> (e  -> f))
     -> ((a :-* b) -> (c :-* d) -> (e :-* f))
inL2 h = inL . h . lapply

The type constraints are starting to get hairy. Fortunately, they’re entirely inferred by the compiler.

Let’s do some inlining and simplification to see what goes on inside inL2:

inL2 h m n
  == (inL . h . lapply) m n                -- inline inL2
  == inL (h (lapply m)) n                  -- inline (.)
  == (linear . (h (lapply m)) . lapply) n  -- inline inL
  == linear (h (lapply m) (lapply n))      -- inline (.)

Ternary transformations are defined similarly. I’ll spare you the type constraints this time.

inL3 :: ( ... ) =>
        ((a  -> b) -> (c  -> d) -> (e  -> f) -> (p  -> q))
     -> ((a :-* b) -> (c :-* d) -> (e :-* f) -> (p :-* q))
inL3 h = inL2 . h . lapply

Look what happens when these operations are composed. As an example,

inL h . inL g
  == (linear . h . lapply) . (linear . g . lapply)
  == linear . h . lapply . linear . g . lapply    -- associativity of (.)
  == linear . h . g . lapply                      -- rule "lapply.linear"
  == inL (h . g)

This transformation is not actually happening in the compiler yet. The “lapply.linear” rule is not firing, and I don’t know why. I’d appreciate suggestions.

There are a few more operations defined in Data.LinearMap. I’ll end with this simple, general definition of composition of linear maps:

-- | Compose linear maps
(.*) :: (VectorSpace c s, LMapDom b s, LMapDom a s) =>
        (b :-* c) -> (a :-* b) -> (a :-* c)
(.*) = inL2 (.)

Derivative towers again

A similar, but recursive, definition is used in the new definition of the the general chain rule for infinite derivative towers, updated since the post Higher-dimensional, higher-order derivatives, functionally.

(@.) :: (LMapDom b s, LMapDom a s, VectorSpace c s) =>
        (b :~> c) -> (a :~> b) -> (a :~> c)
(h @. g) a0 = D c0 (inL2 (@.) c' b')
  where
    D b0 b' = g a0
    D c0 c' = h b0

2 Comments

  1. The Comonad.Reader » Memoizing Context:

    [...] Recently Andy Gill posted a nice use of data families to memoize a narrow range of values for the basis for Conal Elliott’s functional linear map. [...]

  2. Conal Elliott » Blog Archive » Simpler, more efficient, functional linear maps:

    [...] previous post described a data type of functional linear maps. As Andy Gill pointed out, we had a heck of a time [...]

Leave a comment