Optimizing CCCs

In the post Overloading lambda, I gave a translation from a typed lambda calculus into the vocabulary of cartesian closed categories (CCCs). This simple translation leads to unnecessarily complex expressions. For instance, the simple lambda term, “λ ds → (λ (a,b) → (b,a)) ds”, translated to a rather complicated CCC term:

apply ∘ (curry (apply ∘ (apply ∘ (const (,) △ (id ∘ exr) ∘ exr) △ (id ∘ exl) ∘ exr)) △ id)

(Recall from the previous post that (∘) binds more tightly than (△) and (▽).)

However, we can do much better, translating to

exr △ exl

which says to pair the right and left halves of the argument pair, i.e., swap.

This post applies some equational properties to greatly simplify/optimize the result of translation to CCC form, including example above. First I’ll show the equational reasoning and then how it’s automated in the lambda-ccc library.

Equational reasoning on CCC terms

First, use the identity/composition laws:

f ∘ id ≡ f
id ∘ g ≡ g

Our example is now slightly simpler:

apply ∘ (curry (apply ∘ (apply ∘ (const (,) △ exr ∘ exr) △ exl ∘ exr)) △ id)

Next, consider the subterm apply ∘ (const (,) △ exr ∘ exr):

  apply ∘ (const (,) △ exr ∘ exr)
≡ {- definition of (∘)  -}
  λ x  apply ((const (,) △ exr ∘ exr) x)
≡ {- definition of (△) -}
  λ x  apply (const (,) x, (exr ∘ exr) x)
≡ {- definition of apply -}
  λ x  const (,) x ((exr ∘ exr) x)
≡ {- definition of const -}
  λ x  (,) ((exr ∘ exr) x)
≡ {- η-reduce -}
  (,) ∘ (exr ∘ exr)

We didn’t use any properties of (,) or of (exr ∘ exr), so let’s generalize:

  apply ∘ (const g △ f)
≡ λ x  apply ((const g △ f) x)
≡ λ x  apply (const g x, f x)
≡ λ x  const g x (f x)
≡ λ x  g (f x)
≡ g ∘ f

(Note that I’ve cheated here by appealing to the function interpretations of apply and const. Question: Is there a purely algebraic proof, using only the CCC laws?)

With this equivalence, our example simplifies further:

apply ∘ (curry (apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr)) △ id)

Next, lets focus on apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr). Generalize to apply ∘ (h ∘ f △ g) and fiddle about:

  apply ∘ (h ∘ f △ g)
≡ λ x  apply (h (f x), g x)
≡ λ x  h (f x) (g x)
≡ λ x  uncurry h (f x, g x)
≡ uncurry h ∘ (λ x  (f x, g x))
≡ uncurry h ∘ (f △ g)

Apply to our example:

apply ∘ (curry (uncurry (,) ∘ (exr ∘ exr △ exl ∘ exr)) △ id)

We can simplify uncurry (,) as follows:

  uncurry (,)
≡ λ (x,y)  uncurry (,) (x,y)
≡ λ (x,y)  (,) x y
≡ λ (x,y)  (x,y)
≡ id

Together with the left identity law, our example now becomes

apply ∘ (curry (exr ∘ exr △ exl ∘ exr) △ id)

Next use the law that relates (∘) and (△):

f ∘ r △ g ∘ r ≡ (f △ g) ∘ r

In our example, exr ∘ exr △ exl ∘ exr becomes (exr △ exl) ∘ exr, so we have

apply ∘ (curry ((exr △ exl) ∘ exr) △ id)

Let’s now look at how apply, (△), and curry interact:

  apply ∘ (curry h △ g)
≡ λ p  apply ((curry h △ g) p)
≡ λ p  apply (curry h p, g p)
≡ λ p  curry h p (g p)
≡ λ p  h (p, g p)
≡ h ∘ (id △ g)

We can add more variety for other uses:

  apply ∘ (curry h ∘ f △ g)
≡ λ p  apply ((curry h ∘ f △ g) p)
≡ λ p  apply (curry h (f p), g p)
≡ λ p  curry h (f p) (g p)
≡ λ p  h (f p, g p)
≡ h ∘ (f △ g)

With this rule (even in its more specialized form),

apply ∘ (curry ((exr △ exl) ∘ exr) △ id)

becomes

(exr △ exl) ∘ exr ∘ (idid)

Next use the universal property of (△), which is that it is the unique solution of the following two equations (universally quantified over f and g):

exl ∘ (f △ g) ≡ f
exr ∘ (f △ g) ≡ g

(See Calculating Functional Programs, Section 1.3.6.)

Applying the second rule to exr ∘ (id △ id) gives id, so our swap example becomes

exr △ exl

Automation

By using a collection of equational properties, we’ve greatly simplified our CCC example. These properties and more are used in LambdaCCC.CCC to simplify CCC terms during construction. As a general technique, whenever building terms, rather than applying the GADT constructors directly, we’ll use so-called “smart constructors” with built-in optimizations. I’ll show a few smart constructor definitions here. See the LambdaCCC.CCC source code for others.

As a first simple example, consider the identity laws for composition:

f ∘ id ≡ f
id ∘ g ≡ g

Since the top-level operator on the LHSs (left-hand sides) is (∘), we can easily implement these laws in a “smart constructor” for (∘), which handles special cases and uses the plain (dumb) constructor if no simplifications apply:

infixr 9 @∘
(@∘)  (b ↣ c)  (a ↣ b)  (a ↣ c)
⋯ -- simplifications go here
g @∘ f  = g :∘ f

where is the GADT that represents biCCC terms, as shown in Overloading lambda.

The identity laws are easy to implement:

f @Id = f
Id @∘ g = g

Next, the apply/const law derived above:

apply ∘ (const g △ f) ≡ g ∘ f

This rule translates fairly easily:

Apply @∘ (Const g :△ f) = prim g @∘ f

where prim is a smart constructor for Prim.

There are some details worth noting:

  • The LHS uses only dumb constructors and variables except for the smart constructor being defined (here (@∘)).
  • Besides variables bound on the LHS, the RHS uses only smart constructors, so that the constructed combinations are optimized as well. For instance, f might be Id here.

Despite these details, this definition is inadequate in many cases. Consider the following example:

apply ∘ ((const u △ v) ∘ w)

Syntactically, the LHS of our rule does not match this term, because the two compositions are associated to the right instead of the left. Semantically, the rules does match, since composition is associative. In order to apply this rule, we can first left-associate and then apply the rule.

We could associate all compositions to the left during construction, in which case this rule will apply purely via syntactic matching. However, there will be other rewrites that require right-association in order to apply. Instead, for rules like this one, let’s explicitly left-decompose.

Suppose we have a smart constructor composeApply g that constructs an optimized version of apply ∘ g. This equivalence implies the following type:

composeApply  (z ↣ (a ⇨ b) × a)  (z ↣ b)

Thus

  apply ∘ (g ∘ f)
≡ (apply ∘ g) ∘ f
≡ composeApply g ∘ f

Now we can define a general rule for composing apply:

Apply @∘ (decompL  g :∘ f) = composeApply g @∘ f

The function decompL (defined below) does a left-decomposition and is conveniently used here in a view pattern. It decomposes a given term into g ∘ f, where g is as small as possible, but not Id. Where decompL finds such a decomposition, it yields a term with a top-level (:∘) constructor, and composeApply is used. Otherwise, the clause fails.

The implementation of decompL:

decompL  (a ↣ c)  (a ↣ c)
decompL Id                        = Id
decompL ((decompL  h :∘ g) :∘ f) = h :∘ (g @∘ f)
decompL comp@(_ :∘ _)             = comp
decompL f                         = f :Id

There’s also decompR for right-factoring, similarly defined.

Note that I broke my rule of using only smart constructors on RHSs, since I specifically want to generate a (:∘) term.

With this re-association trick in place, we can now look at compose/apply rules.

The equivalence

apply ∘ (const g △ f) ≡ g ∘ f

becomes

composeApply (Const p :△ f) = prim p @∘ f

Likewise, the equivalence

apply ∘ (h ∘ f △ g) ≡ uncurry h ∘ (f △ g)

becomes

composeApply (h :∘ f :△ g) = uncurryE h @∘ (f △ g)

where (△) is the smart constructor for (:△), and uncurryE is a smart constructor for Uncurry:

uncurryE  (a ↣ (b ⇨ c))  (a × b ↣ c)
uncurryE (Curry f)    = f
uncurryE (Prim PairP) = Id
uncurryE h            = Uncurry h

Two more (∘)/apply properties:

  apply ∘ (curry (g ∘ exr) △ f)
≡ λ x  curry (g ∘ exr) x (f x)
≡ λ x  (g ∘ exr) (x, f x)
≡ λ x  g (f x)
≡ g ∘ f
  apply ∘ first f
≡ λ p  apply (first f p)
≡ λ (a,b)  apply (first f (a,b))
≡ λ (a,b)  apply (f a, b)
≡ λ (a,b)  f a b
≡ uncurry f

The first combinator is not represented directly in our (↣) data type, but rather is defined via simpler parts in LambdaCCC.CCC:

first  (a ↣ c)  (a × b ↣ c × b)
first f = f × Id

(×)  (a ↣ c)  (b ↣ d)  (a × b ↣ c × d)
f × g = f @Exl △ g @Exr

Implementations of these two properties:

composeApply (Curry (decompR  g :Exr) :△ f) = g @∘ f

composeApply (f :Exl :Exr) = uncurryE f

These properties arose while examining CCC terms produced by translation from lambda terms. See the LambdaCCC.CCC for more optimizations. I expect that others will arise with more experience.

4 Comments

  1. Derek Elkins:

    I think your difficulty with apply/const comes from the incoherence of const in this context; it adds a kind of “well-pointedness” that goes beyond a CCC and goes beyond what you need for at least this example. In a nutshell, it doesn’t make sense to say “const x” in general since there is no way to talk about elements, x, “in” some object. However, category theory does let us talk about arrows and that leads to what is sometimes called the “name” of an arrow (in a CCC). Specifically, const f = curry (f ∘ exr) which should look familiar. From there, it’s easy using the universal property of apply (as a counit of the relevant adjunction) which is: apply ∘ (curry f × id) = f. You get:

    apply ∘ (const g △ f)
    = apply ∘ (curry (g ∘ exr) △ f)             -- expand definition of const
    = apply ∘ (curry (g ∘ exr) × id) ∘ (id △ f) -- naturality of fork, tacitly adding some ids
    = g ∘ exr ∘ (id △ f)                        -- universal property of apply
    = g ∘ f                                     -- (half of) universal property of pairs
    
  2. Dan Krejsa:

    A very slightly different take expanding a bit on Derek Elkin’s comment. Letting 1 (a.k.a. Unit) be the terminal object of the CCC, one can consider morphisms (1 ↣ b) as “elements of b” in some sense. Then one can have a function corresponding to ‘const’ applying to CCC morphisms:

    const :: (1 ↣ b) -> (a ↣ b)
    const e = e ∘ it

    where it :: a ↣ 1 is the unique morphism from a to 1.

    Now given CCC morphisms g :: b ↣ c and f :: a ↣ b, we may view

    g ∘ exr :: (1 × b) ↣ c

    and let

    g' = curry (g ∘ exr) :: 1 ↣ (b ⇨ c)

    Then

    const g' :: a ↣ (b ⇨ c)
    const g' = g' ∘ it = curry (g ∘ exr) ∘ it

    and we have

    apply ∘ (const g' △ f)
      = apply ∘ (curry (g ∘ exr) ∘ it △ f)
      = apply ∘ (curry (g ∘ exr) ∘ it △ id ∘ f)
      = apply ∘ (curry (g ∘ exr) ∘ exl △ id ∘ exr) ∘ (it △ f)
      = apply ∘ (curry (g ∘ exr) ∘ exl △ exr) ∘ (it △ f)
      = (g ∘ exr) ∘ (it △ f)
      = g ∘ (exr ∘ (it △ f))
      = g ∘ f

    using the defining property of ‘curry’ that for any f : c × a -> b,

    f = apply ∘ (curry f ∘ exl △ exr)

    and the result

    h ∘ f △ k ∘ g = (h ∘ exl △ k ∘ exr) ∘ (f △ g)

  3. conal:

    Derek & Dan: Thanks very much for the suggestions to eliminate const, which felt shaky indeed!

  4. Conal Elliott » Blog Archive » Circuits as a bicartesian closed category:

    […] previous few posts have been about cartesian closed categories (CCCs). In From Haskell to hardware via cartesian […]

Leave a comment