Purity, delimited continuations and I/O #
In this post I'll explain how I deal with I/O and side effects in Valeri, which is a pure language.
To explain it in simple terms, let's write a program that counts from 1 to 10:
(fn count (x n)
(when (<= x n)
(println x)
(count (+ x 1) n)
)
)
(count 1 10)
This program recursively calls the count function until it hits the depth limit of 10. On every call, it will execute println which outputs the current depth to the console. This is a simple and obvious program, that looks quite natural. But how can it output anything to the screen if Valeri is pure, and I/O is a side effect?
The answer is delimited continuations. Unfortunately, the linked Wikipedia article is a bit hard to understand because it uses the shift/reset approach, so I'll try to explain it in simpler terms.
Python implementation #
Since you likely can at least read Python, let's rewrite the same program in Python and gradually modify it to be pure.
def count(x, n):
if x <= n:
print(x)
count(x+1, n)
This still uses print, but it's fixable if we use generators:
def count(x, n):
if (x <= n):
yield x
yield from count(x+1, n)
computation = count(1, 10)
for io in computation:
print(io)
As you can see, print now is only present at the top level. It means that with this simple trick, we've turned count into a pure function that suspends its execution every time it needs to do IO and the control is passed to the place where the top-level count was called. The loop will then return the control back to where it was suspended. What is stored in the computation variable is called a "continuation", because you can resume its execution.
However, it's not so fun to be able to only print integers. Let's modify the code to also read the number we want to count to. For this to work, we need to distinguish between the types of IO operations:
# Creates a specific IO operation and yields it, to
# be catched by the outside IO loop
def IO(io_type, *params):
res = yield {"io_type": io_type, "params": params}
return res
def count(x, n=None):
if n is None:
# If "n" is not specified, read if from stdin
n = yield from IO("read")
n = int(n)
if (x <= n):
yield from IO("print", x)
yield from count(x+1, n)
computation = count(1)
while True:
try:
io = next(computation)
except StopIteration:
break
# Pick a specific IO action to execute
if io["io_type"] == "print":
print(io["params"][0])
elif io["io_type"] == "read":
computation.send(input())
This example is a bit more complex, but hopefully still understandable. We use a less-well-known send operation that allows to return a value back to the generator. When you do so, yield will have a return value which you can assign to a variable.
If you run the program, it will ask you for a value up to which it needs to count, and then print numbers between 1 and that value. If we imagine that the IO loop was part of the language runtime, and not part of our code, you'd have what is called an "effect" system. This effect system is quite powerful: you can for example record all IO that is happening during the execution and then replay it back to reproduce bugs. Or you can decide to batch certain IO operations (like "print").
Of course, with Python the pinch of salt would be its non-determinism and state mutability. There are many things that in practice can go differently between separate executions of the same algorithm if you're not careful.
Valeri implementation #
Valeri now provides roughly the same IO service out of the box. It works in a similar manner to Python:
Whenever an IO operations like
printlnis called, a special "task" object is createdCurrent execution is captured to a continuation, together with the task
The continuation object is thrown up the stack until it meets the first handler
If the handler is the top-level IO service, it takes the task and executes it
It then resumes the continuation by calling the continuation object with the IO result
So, if we try the following in the REPL:
valeri> (println "foobar") foobar
It is roughly equivalent to:
valeri> (raise (task 'print "foobar\n")) foobar
Here, the
raisefunction is a universal way to "capture" the delimited continuation and unwind the stack. In fact, raising errors works the same way, just the parameter toraisehas a different type:valeri> (fn myfun () (raise (error "some-error"))) #<function myfun> valeri> (myfun) Uncaught error: #<continuation> Function error Function myfun
The difference with Python is mostly in the convenience of not having to type
yield fromfor intermediate functions. This makes the code completely transparent and at the first glance work as if it has side effects.By using delimited continuations, you can implement many standard primitives like:
Coroutines
Exceptions
Global state
Green threads
Lazy sequences
… and much more
The current implementation in Valeri is not yet capable of doing this, but mainly because there's currently no way to "catch" the continuation in the user's code. Stay tuned for the updates.