Konstantin Nazarov

Capturing and serializing continuations

With latest patches to Valeri it became possible to do some magic that you just can't do in Python. At least not without creating lots of tricky edge cases. Because of Valeri's immutability, objects in memory always form a unidirectional graph. The same is true for virtual machine runtime. This means that it is possible to take current state of computation, serialize it to a contiguous byte array, send over the network and then resume the computation on the other end.

Because of the referential transparency, objects in memory don't have any "identity" other than their contents. So if they are equal, you can substitute one with the other and nothing will change. Let's look at the example, and then I'll break it down:

(def c (guard (fn body () (+ 1 (raise :unused)))
              (fn handler (err cont) cont)))

;; "serialized" will hold the detached representation
;; of the continuation "c"
(def serialized (serialize c))
(println "serialized: " serialized "\n")

(def deserialized (deserialize serialized))
(println "deserialized:	" deserialized "\n")

;; Call the deserialized continuation
(println "result: " (deserialized 2))

First, let's start with guard. It does pretty much the same thing as exception handlers do in other languages. Its prototype is (guard <function> <exception-handler>). It calls the <function> and if an exception occurs during the execution, it calls the <exception-handler>. Exception handler takes two arguments: first is the exception object itself, and the second is the "continuation" object. The only difference with typical programming languages is that by calling the "continuation" object, you will jump back to where the exception has occured, and the (raise) function will return the value passed to the continuation as a parameter.

In this example, we suspend execution right at the time the second argument to addition is computed. If the continuation is called with an integer parameter - it will be used in this addition. So effectively, you can think that the continuation contains a function roughly equivalent to (fn (x) (+ 1 x)). For simplicity, we have a very simple function here. But it can in fact be arbitrarily nested and use recursion. In such case the whole call stack starting from the "guard" will be preserved.

Because in the handler function we immediately return the continuation object - it will become the result of the whole guard statement and then assigned to variable c. We can then repeatedly call c with different arguments like (c 5) to get 6. The number of times we call it doesn't matter: every time execution will be "forked off" from the point where the original computation has been suspended.

Next in the example we have this:

(def serialized (serialize c))
(println "serialized: " serialized "\n")

Here, (serialize c) takes the computation state graph in memory and flattens it into the byte array (possible due to the graph being unidirectional). When executed, it will give you something like this:

serialized:  #<bytearray 1200076003000608000710000740000...>
;; Some of the output was trimmed to save screen space

This flattened representation is computed by getting hold of c as a "root" and then traversing the graph, copying and compacting every object traversed into the contiguous area in memory. Because memory addressing in Valeri is offset-based, there is no separate format for the serialized data. Data structures in memory are exactly the same as in the serialized form.

Then we deserialize the byte array back:

(def deserialized (deserialize serialized))
(println "deserialized:	" deserialized "\n")

When executed, this code gives us:

deserialized:  #<continuation>

It's important to reiterate that there is no separate format for serialized data. Because of this, "deserialization" just involves copying the content of the byte array into the heap, and reinterpreting the beginning of that data as the beginning of the resulting data structure. Of course I'll do some validation in the future so that you can't corrupt the runtime by handcrafting the byte array, but the general approach will stay the same.

And finally, we execute the deserialized continuation:

(println "result: " (deserialized 2))

Which gives us:

result:  3

And of course we can easily run it again:

(println "result: " (deserialized 42))
result:  43

This execution model is very flexible. By using it, you can save the full "image" of the running program, or only part of it. Or send computation over the network to be performed on the "other side", where that "other side" can be reasonably certain that the computation won't be able to do anything it's not supposed to do.

Another interesting thing you can do with this is "execution snapshotting", where you can save state of your program at a certain point in time, and then revert back to it if you are debugging or otherwise reproducing unintended behavior.