Garbage collecting the semantics of FRP

Ever since ActiveVRML, the model we’ve been using in functional reactive programming (FRP) for interactive behaviors is (T->a) -> (T->b), for dynamic (time-varying) input of type a and dynamic output of type b (where T is time). In “Classic FRP” formulations (including ActiveVRML, Fran & Reactive), there is a “behavior” abstraction whose denotation is a function of time. Interactive behaviors are then modeled as host language (e.g., Haskell) functions between behaviors. Problems with this formulation are described in Why classic FRP does not fit interactive behavior. These same problems motivated “Arrowized FRP”. In Arrowized FRP, behaviors (renamed “signals”) are purely conceptual. They are part of the semantic model but do not have any realization in the programming interface. Instead, the abstraction is a signal transformer, SF a b, whose semantics is (T->a) -> (T->b). See Genuinely Functional User Interfaces and Functional Reactive Programming, Continued.

Whether in its classic or arrowized embodiment, I’ve been growing uncomfortable with this semantic model of functions between time functions. A few weeks ago, I realized that one source of discomfort is that this model is mostly junk.

This post contains some partially formed thoughts about how to eliminate the junk (“garbage collect the semantics”), and what might remain.

There are two generally desirable properties for a denotational semantics: full abstraction and junk-freeness. Roughly, “full abstraction” means we must not distinguish between what is (operationally) indistinguishable, while “junk-freeness” means that every semantic value must be denotable.

FRP’s semantic model, (T->a) -> (T->b), allows not only arbitrary (computable) transformation of input values, but also of time. The output at some time can depend on the input at any time at all, or even on the input at arbitrarily many different times. Consequently, this model allows respoding to future input, violating a principle sometimes called “causality”, which is that outputs may depend on the past or present but not the future.

In a causal system, the present can reach backward to the past but not forward the future. I’m uneasy about this ability as well. Arbitrary access to the past may be much more powerful than necessary. As evidence, consult the system we call (physical) Reality. As far as I can tell, Reality operates without arbitrary access to the past or to the future, and it does a pretty good job at expressiveness.

Moreover, arbitrary past access is also problematic to implement in its semantically simple generality.

There is a thing we call informally “memory”, which at first blush may look like access to the past, it isn’t really. Rather, memory is access to a present input, which has come into being through a process of filtering, gradual accumulation, and discarding (forgetting). I’m talking about “memory” here in the sense of what our brains do, but also what all the rest of physical reality does. For instance, weather marks on a rock are part of the rock’s (present) memory of the past weather.

A very simple memory-less semantic model of interactive behavior is just a -> b. This model is too restrictive, however, as it cannot support any influence of the past on the present.

Which leaves a question: what is a simple and adequate formal model of interactive behavior that reaches neither into the past nor into the future, and yet still allows the past to influence the present? Inspired in part by a design principle I call “what would reality do?” (WWRD), I’m happy to have some kind of infinitesimal access to the past, but nothing further.

My current intuition is that differentiation/integration plays a crucial role. That information is carried forward moment by moment in time as “momentum” in some sense.

I call intuition cosmic fishing. You feel a nibble, then you’ve got to hook the fish. – Buckminster Fuller

Where to go with these intuitions?

Perhaps interactive behaviors are some sort of function with all of its derivatives. See Beautiful differentiation for an specification and derived implementation of numeric operations, and more generally of Functor and Applicative, on which much of FRP is based.

I suspect the whole event model can be replaced by integration. Integration is the main remaining piece.

How weak a semantic model can let us define integration?

Thanks

My thanks to Luke Palmer and to Noam Lewis for some clarifying chats about these half-baked ideas. And to the folks on #haskell IRC for brainstorming titles for this post. My favorite suggestions were

  • luqui: instance HasJunk FRP where
  • luqui: Functional reactive programming’s semantic baggage
  • sinelaw: FRP, please take out the trash!
  • cale: Garbage collecting the semantics of FRP
  • BMeph: Take out the FRP-ing Trash

all of which I preferred over my original “FRP is mostly junk”.

34 Comments

  1. Noam Lewis:

    So what exactly are the undenotable semantic values in FRP?

  2. Twan van Laarhoven:

    A function of type “(T->a) -> (T->b)” only allows arbitrary transformation of time (and hence access to the future) if there is a way to transform values of type T. If T were some abstract type then a behavior could only access the present, or perhaps by storing a T value, the past at that particular time.

    This could perhaps be formalized by quantifying over T, so “type SF a b = forall t. (t -> a) -> (t -> b)”. Of course this model disallows any kind of interesting time varying behavior, so perhaps t should be an instance of some type class “Time t”.

  3. Sjoerd Visscher:

    Would (T->a) -> (T->b) be sufficient if it is not possible to create values of type T? Then the only thing you can do with (T->a) is to pass it the T from (T->b), i.e. the current time.

  4. conal:

    Wow — Twan & Sjoerd are making the same suggestion. Did you two talk?

    Just to be clear, T is a semantic type. I.e. your idea depends on the semantic type being abstract.

    I’m intrigued with Twan’s specific formalization via universal quantification. I wonder what could be in the Time type class that would allow integration but disallow time travel.

  5. Steven Fodstad:

    I’m talking about “memory” here in the sense of what our brains do, but also what all the rest of physical reality does. For instance, weather marks on a rock are part of the rock’s (present) memory of the past weather.

    Future weather doesn’t mar present rocks. You make a good point that the past should be discarded by default, but past and future are still not symmetric: you can accumulate evidence of the past, but not the future. I suggest we define causality such that both unfettered access and accumulations in the present are causal; on the other hand, accumulations in the present of future values (or any other method of impacting the present) would not be causal.

  6. Paul Liu:

    Hi Conal, you might be interested to take a look at our recent work on a particular space leak problem originated from the “tower of derivatives”, which leads to a solution based on CCA (Causal Commutative Arrows). It somewhat relates to the idea of AD, but from another angle. It is available on my home page http://www.cs.yale.edu/homes/hl293/

    Coming back to your question on, if we choose the init operator in CCA to be delay of a single unit, it fits dataflow or discrete stream model. But if we choose init to be integral, it fits the continuous model in FRP. CCA is abstract enough to avoid the actual semantic model but still capture some commonalities.

    The problem with Functor or Applicative is that they are insufficient, and for similar reasons Arrow is insufficient. Comonad actually is a more interesting abstraction that captures a stream with a moving cursor. But still, it cannot say anything about past or future without a more domain specific operator. This is where our work on CCA comes from.

    But CCA as it stands right now, still won’t fulfill your quest since the product law by itself doesn’t say anything about past or future, not even time itself. Causality is not fully captured in CCA (but I’d say a weak notion of causality is). I feel that talking about time (or whatever nature of the discrete or continuous stream that time is) is unavoidable if causality is really your goal.

  7. sclv:

    The question is first, I think, what does integration mean in a semantic context? F a at t = (a at t’, a at t) ? I think this is too powerful. Integration preserves not history, but some summary trace of it — intuitively I think of it as the limit of a fold. Integration by a discrete timestep at least could maybe be F a at t dt -> f (a at t) (a at (t + dt)) dt. In the numeric case, little f is a trapezoidal estimate or the like. In the general case, f has to be a function of both a and t, which I think is the key realization. Also, we know what it means to pass from the discrete to the continuous formulation for the numeric case, but not in general. Looking to differentiating data types could be the answer here — the f for an adt produces a “script” to transform it, and differentiating yields the hole remaining?

  8. sclv:

    One other note: what if we stick with a -> b but provide a function F a :: a -> a’ such that a’ is the integral of a? Time falls out of the picture entirely (or rather, it exists only for F). So the meaning of time is relative to the types of values we’re working with.

  9. Janos Kramar:

    I guess the only kinds of transformations trans:(T->a)->(T->b) you wish to allow are those for which, for all t1<t2, (trans f) t2 is completely determined by (trans f) t1 and by f t for t1<=t<=t2. (Or maybe even t1<=t<t2!) Maybe this is worth spelling out; though I’m not sure what to do with it. But in this formulation it seems to require only that T be totally ordered, so perhaps integration per se isn’t essential.

  10. Patai Gergely:

    I have only a little worry about this WWRD principle. To oversimplify matters, there are two worlds to work in, which I’ll sloppily refer to ‘numeric’ and ‘symbolic’. For instance, neurons belong to the first category, while a large part of our consciousness to the second, with a huge bit of ‘magic happens here’ in the middle. It seems to me that you equate the R in WWRD with the numeric world, but not the symbolic one. However, I believe that events belong to the latter, since they are basically efficient abstractions over complex phenomena in the numeric layer. And that’s true in general: while both worlds can simulate each other, this simulation is extremely inefficient because they provide completely different abstractions. So either you need to describe both, or go for the magic bit.

  11. Noam Lewis:

    I mean, why do you call past and future access “junk”? The FRP semantic model can and does denote that, albeit implicitly, unless I’m missing something. Behaviors are a function: Time -> a , and in no way does the model restrict uses of the ‘at’ function to the “current” time, which is not even defined.

    Could you mean that there are denotable things we don’t want in the model at all? That is related to what Twan & Sjoerd talk about.

  12. conal:

    Behaviors are a function: Time -> a , and in no way does the model restrict uses of the ‘at’ function to the “current” time, which is not even defined.

    Could you mean that there are denotable things we don’t want in the model at all?

    Hm. I think you answered your own question here. Which means I left something unclear in my post.

    As you said, the model includes meanings like \ f -> \ t -> f (t + 5), in which the signal f is transformed in time. I’d like such meanings to be non-denotable (junk). The idea here is that signals are live input & output and so cannot be time-transformed.

  13. sclv:

    Another small note — assume we only have a -> b, and integ :: a -> a’ (and I suppose a diff as well). We can still define t as integ 1. So everything that can be written with (T -> a) -> (T -> b) can, I think, be trivially translated. I suspect this is still a win, but the reason why is now somewhat more subtle.

  14. Derek Elkins:

    Some random references.

    The Daniell Integral is an axiomatic characterisation of the (analytic) integral. It’s almost certainly overly specialized for your uses, but it may be handy to look at its definition.

    Your description of memory makes me think of the concept of stigmergy.

    Probably most relevant/direct is Synthetic Differential Geometry where there is a directly a notion of an “infinitesimal time interval.” Call that interval T, then an arrow T -> X, where X is some (synthetic) manifold, is a vector. Thus the whole function space T -> X is a vector field. The interesting thing is the functor (T ->) has a right adjoint that represents an integral flow.

  15. Jake McArthur:

    If we don’t export anything exposing time, such as time :: Behavior T, then do we still suffer from the same problem? To be clear, this is distinct from exposing T as an abstract type.

  16. Tim:

    In the spirit of WWND consider the following setup inspired by the rational agent architecture which can be found in Russell and Norvig.

    A system involving an intelligent agent consists of the environment e and an agent, h. Think of these as two functions where h is represented by haskell code (of course) and e may or may not be. To evolve the system, e provides a value to h, in the form of observable state, s :: S. The state variable might be considered a tuple containing all the values h can access. Notice there is no time index. h takes the state and selects an action a :: A, which e uses to evolve the state. This simple description describes a pair (h :: S -> A , e :: S -> A -> S) and the behavior of the environment and agent together evolves like this: iterate (e . h) s0

    There is no restriction on how much or little of the past the agent, h, can observe including its own actions. These are design choices. There is no restriction on whether e is deterministic or not. It depends on the problem. In agent-design-land we like to think e is probabilistic with an unknown distribution and are interested in the (probability) space S x A x S to evaluate the suitability of the agent for its environment, which is a different calculation. This space need not affect the definition of h, but we sometimes like to think of our agents as little versions of ourselves, who are interested in the same space, subject to different cognitive limits. In one design, the agent might try to make a decision by keeping some record of S x A x S and updating its own estimate of the associated probability table. In choosing the best action (in the definition of h), it is quite likely that a conditional expectation, that is, an integral over the second s in this table will be useful. As an aside, one can even usefully adopt the point of view that designing a conditional expectation operator without direct reference to probability is a more fundamental starting point for agent design.

    What about time? Time is only implicit here and it seems to me that any statements concerning time would have to enforced as restrictions on the possible traces of such a program stated above. In a setup where e is deterministic, it is conceivable that h will be able to infer and use forecasts of future values of s somewhere in its definition. If e turns out not to be deterministic, these forecasts won’t be correct some runs of the program. So h can act as if it has foresight, but that doesn’t make it so. The question of whether h can depend on the future will therefore reduced to the ability of h (and its designer) to predict the future.

    So much for one view of Nature. Does this connect back to FRP in any way? I’m still thinking, but if time is actually a summary of how states can be ordered in a trace, then any restrictions imposed by traces would be natural and prevent perfect foresight in situations when it isn’t possible. However the idea of expectation (integration!) is still very useful.

    Gee, I hope some of this made sense.

  17. Derek Elkins:

    memory = state

    Thinking about notions that encompass discrete and continuous state leads me to control/system theory. One particularly interesting person in that area is Jan C. Willems. Particularly, the papers In control, almost from the beginning until the day after tomorrow, a autobiographical retrospective, and A behavioral approach to open and inteconnected systems, an introductory article on his particular approach, are, in that order, good starting points into his work.

  18. conal:

    Jake: Do you mean keep the semantic model as (T->a)->(T->b) and shrink the API? If so, we’d only get more junk, not less. Do you mean something else?

  19. Jake McArthur:

    Here is a quote from Luke Palmer’s Semantic Design blog post:

    But it is reasonable not to expose the full power of the semantic domain through the code interface. You can imagine that if this image library is for drawing on hardware using OpenGL, then supporting an arbitrary function from reals to colors may be a speed nightmare, so we just won’t expose that. The operations we expose simply need be a subset of the operations expressible on the domain.

    Maybe I misunderstand the meaning of junk, but I wasn’t under the impression that junk-freeness conflicts with selective exposure of the model. Do you know of any literature that explains the meaning of junk-free in a bit more depth? Google isn’t helping me much.

  20. Derek Elkins:

    For Jake: “Junk-freeness” is usually called definability or full completeness in denotational semantics. The “junk” terminology comes from, primarily, universal algebra and algebraic semantics. There the motto is “no confusion, no junk.”

    Junk is exactly the stuff that the model(=semantics) has for which there is no interface(=syntax.) So “junk-freeness” conflicts with selective exposure pretty much by definition.

    Here is a trivial example of a semantics with junk. Say we have a syntax for addition expressions, Exp ::= NATURAL | Exp + Exp. If our denotation function has type Exp -> Integer and has the “standard” definition then all the negative integers are junk because our denotation function is not surjective. There are two main ways to fix this. We can 1) cut down the semantic domain, e.g. using the naturals instead of the integers or 2) we can expand the syntax so that the extra elements, the junk, becomes expressible, e.g. we can add negation or subtraction. The latter route isn’t always possible, for example, if our semantic domain had been the reals then there is no way not to have junk without allowing infinite syntax.

    Junk is inversely proportional to precision; the more junk you have, the less precision your semantics has.

  21. Yair Chuchem:

    Hey Conal,

    I think I may have a nice model that doesn’t allow future access and junk.

    data Program a b = Program { progVals :: [b] , progMore :: Maybe (a -> Program a b) }

    It’s kinda a signal transformer though.

    It’s available in hackage package “peakachu”.

    You can also cabal install DefendTheKing for an example RTS game implemented with it.

  22. conal:

    I think I may have a nice model that doesn’t allow future access and junk.

    data Program a b = Program { progVals :: [b] , progMore :: Maybe (a -> Program a b) }

    It’s kinda a signal transformer though.

    Looks like a Moore machine type but with multiple outputs instead of one. Why [b] instead of b? For a Mealy counterpart, see the MutantBot type at Functional reactive partner dancing. (More variations in other bot articles.)

    I’m not seeing continuous time here. Maybe Program could be composed with a simple type of continuous, non-reactive time functions, as in Push-pull functional reactive programming.

    Is the Maybe there so you know you’re in a final state? If so, and if you don’t really need to know, you could simplify away the Maybe and have all inputs lead back to the same interactive behavior.

    Since your Program is not syntactic, maybe there’s a more fitting name to be found.

    Have you thought about how to combine machines that have different input types?

  23. Yair Chuchem:

    Why [b] instead of b?

    I’m using an alternative to arrows where in merging machines I result with a sum-type result and not a product-type. A list of a sum-type allows to propagate an update on only some values while with product-types it’s hard to tell which has changed.

    I’m not seeing continuous time here.

    Yeah, you are right. Peaker was bugging me about it too. I should re-read your post about why continuous time matter because you guys tend to be right about these things :)

    Is the Maybe there so you know you’re in a final state

    Yes.

    If so, and if you don’t really need to know, you could simplify away the Maybe and have all inputs lead back to the same interactive behavior.

    I do it because I support the merging two machines by running them in sequence and for that I need to know when the machine ends.

    Have you thought about how to combine machines that have different input types?

    I combine these machines by having an input sum-type. I also have a counterpart to Program called Backend which is merge-able too. My example game and font editor have inputs from and outputs to several backends (GLUT, File, network, etc.)

    btw, did you try running the game? :)

  24. conal:

    @Yair: Thanks for the explanations. I like the idea of shifting from products to sums in order to track changes finely. I’ve made some similar explorations, plus a possibly-related post Pairs, sums, and reactivity.

    Great to have people exploring this area.

    No, I haven’t tried your game yet.

  25. Beliefs and the Unimpossibility of Functional Reactive Programming « Luke Palmer:

    [...] and the rest of the FRP gang has been discussing the semantic junk in classic FRP. Clearly there is more (or rather, less) to an interactive Behavior than an [...]

  26. Frederick Ross:

    I’ve been playing with this and do you really need integration? I feel like you might make something of unification. For example, say you have two text boxes and a button that’s supposed to swap the values in them,

    (val1, val2, True) :- (val2, val1, False) x (textBox, textBox, button).

    where you unify forwards in time, and the program makes some attempt to use the latest state possible for any given unification.

    I suppose it’s not really FRP if you do it that way, though…

  27. Henrik Nilsson:

    Hi Conal,

    You write:

    Instead, the abstraction is a signal transformer, SF a b, whose semantics is (T->a) -> (T->b). See Genuinely Functional User Interfaces and Functional Reactive Programming, Continued.

    Note that the Yampa papes always insisted this was just a conceptual definition to convey the basic intuitions, a first approximation: Yampa’s signal functions were always causal by construction, which the FRP Continued paper does state explicitly, and the reason was preciciely to rule out the “junk”, i.e. the signal function we cannot hope to implement in a reactive setting where the input is only revealed as time progresses. The approximate nature of this intuitive definition of signal functions was made even more explicit in later papers by using the “apprixmately equal” symbol in the definition, and even later papers by Neil Sculthorpe and myself has elaboated further on the point of casuality (and other useful temporal properties).

  28. Jason Stumpf:

    Working on my own I came up with basically the same thing as Yair, but without anything in hackage yet.

    data Reaction t a b = Reaction [(t,b)] ((t,a) -> Reaction t a b)

    t is the type of time. A Reaction has [(t,b)] the events that would come out if no events went in (where each t is the time that has passed since the last event, or since this reaction began in the case of the first event), and ((t,a) -> Reaction t a b) how it responds to an event after a certain amount of time.

    I’m tentatively considering continuous time as events of behaviors, with behaviors represented as (t -> b) functions. When the event happens, the continuous behavior begins, when the next event happens, a new continuous behavior begins. This does allow one to use the future and the past, but only as though no events happened or will happen. The way I see it, this is similar to Conal’s idea of nature, since from a tower of derivatives you can get the behavior function to high precision at any time, but not through discontinuities. If a stone is thrown, we know where it will land, unless a bird grabs it.

  29. Point-wise FRP, plus memory – Part 1 « Noam Lewis:

    [...] FRP, plus memory – Part 1 By sinelaw After Conal’s blog post, I’ve been thinking about possible models of FRP that satisfy the “no arbitrary time [...]

  30. Wolf Tivy:

    In control theory, we use the Laplace Transform to turn time (t) domain integral and differential equations into frequency (s) domain “transfer functions”. There is some amount of non-causality in the definition of s and the Laplace Transform, as it involves integration over t=0..inf, but I think it is is very easy to restrict.

    In the time domain, differentiation and integration are difficult to compute and composing systems involves the usual gymnastics. In the s domain however, a differentiator is just D(s)=s, an integrator is I(s)=1/s, and composition is just multiplication.

    The point of all this is that the s-domain somehow encapsulates all derivatives and integrals into one complex number to allow any causal system to be defined as a simple Complex->Complex. This seems to be what you are looking for, specifically the “behaviors are some sort of function with all of its derivatives” part.

    I have been trying to apply these (control theory) ideas to FRP ever since I discovered FRP. The kinks are in a) Trying to connect s-domain transfer functions to time domain data streams (I think MATLAB (simulink) does it somehow). b) Extending the concept to arbitrary data (not just vectors of complex). And c) Somehow coping with dynamically determined transfer function networks. If you could work out these issues I think you would have a the right model for FRP.

    The laplace transform seems to deal with discontinuities fairly well; the laplace transform of the delayed unit step is just Del(s)=exp(s-a) or something, though again, how to connect this to the outside world is unknown to me.

    I think it would be a good idea to look into control theory for some ideas, even if the laplace transform and the s domain cant be used directly. The concept of using arbitrary access to past and future (at least in an abstract denonational sense) and then encapsulating it into some parameter might be useful for FRP.

    I hope this is helpful. Good Luck.

  31. Lambdor:

    In game development I think we would very much like to go forward and backward in time and skew it (see games like Braid, the new Prince of Persia, Max Payne etc.) but maybe this is a totally different aspect which should be covered by some sort of “recording”.

  32. conal:

    Lambdor: I’m with you on both counts. For some related remarks, see the post Why classic FRP does not fit interactive behavior.

  33. A Model for Denotative Continuous-Time Programming « Jake McArthur:

    [...] Garbage Collecting the Semantics of FRP, Conal muses over some of the issues with the semantics of DCTP. He explains that both classic and [...]

  34. Sjoerd Visscher:

    You might be interested in this: http://www.reddit.com/r/haskell/comments/mj23v/new_video_lecture_how_to_be_more_productive/

    I wonder if it is possible (or makes sense even) to apply the Banach fixed-point theorem to the continuous time model.

Leave a comment