Purity, delimited continuations and I/O
In this post I'll explain how I deal with I/O and side effects in Valeri, which is a pure language.
To explain it in simple terms, let's write a program that counts from 1 to 10:
(fn count (x n)
(when (<= x n)
(println x)
(count (+ x 1) n)
)
)
(count 1 10)
This program recursively calls the count
function until it hits the depth limit of 10. On every call, it will execute println
which outputs the current depth to the console. This is a simple and obvious program, that looks quite natural. But how can it output anything to the screen if Valeri is pure, and I/O is a side effect?
The answer is delimited continuations. Unfortunately, the linked Wikipedia article is a bit hard to understand because it uses the shift/reset approach, so I'll try to explain it in simpler terms.
Python implementation
Since you likely can at least read Python, let's rewrite the same program in Python and gradually modify it to be pure.
def count(x, n):
if x <= n:
print(x)
count(x+1, n)
This still uses print
, but it's fixable if we use generators:
def count(x, n):
if (x <= n):
yield x
yield from count(x+1, n)
computation = count(1, 10)
for io in computation:
print(io)
As you can see, print
now is only present at the top level. It means that with this simple trick, we've turned count
into a pure function that suspends its execution every time it needs to do IO and the control is passed to the place where the top-level count
was called. The loop will then return the control back to where it was suspended. What is stored in the computation
variable is called a "continuation", because you can resume its execution.
However, it's not so fun to be able to only print integers. Let's modify the code to also read the number we want to count to. For this to work, we need to distinguish between the types of IO operations:
# Creates a specific IO operation and yields it, to
# be catched by the outside IO loop
def IO(io_type, *params):
res = yield {"io_type": io_type, "params": params}
return res
def count(x, n=None):
if n is None:
# If "n" is not specified, read if from stdin
n = yield from IO("read")
n = int(n)
if (x <= n):
yield from IO("print", x)
yield from count(x+1, n)
computation = count(1)
while True:
try:
io = next(computation)
except StopIteration:
break
# Pick a specific IO action to execute
if io["io_type"] == "print":
print(io["params"][0])
elif io["io_type"] == "read":
computation.send(input())
This example is a bit more complex, but hopefully still understandable. We use a less-well-known send
operation that allows to return a value back to the generator. When you do so, yield
will have a return value which you can assign to a variable.
If you run the program, it will ask you for a value up to which it needs to count, and then print numbers between 1 and that value. If we imagine that the IO loop was part of the language runtime, and not part of our code, you'd have what is called an "effect" system. This effect system is quite powerful: you can for example record all IO that is happening during the execution and then replay it back to reproduce bugs. Or you can decide to batch certain IO operations (like "print").
Of course, with Python the pinch of salt would be its non-determinism and state mutability. There are many things that in practice can go differently between separate executions of the same algorithm if you're not careful.
Valeri implementation
Valeri now provides roughly the same IO service out of the box. It works in a similar manner to Python:
- Whenever an IO operations like
println
is called, a special "task" object is created - Current execution is captured to a continuation, together with the task
- The continuation object is thrown up the stack until it meets the first handler
- If the handler is the top-level IO service, it takes the task and executes it
- It then resumes the continuation by calling the continuation object with the IO result
So, if we try the following in the REPL:
valeri> (println "foobar")
foobar
It is roughly equivalent to:
valeri> (raise (task 'print "foobar\n"))
foobar
Here, the raise
function is a universal way to "capture" the delimited continuation and unwind the stack. In fact, raising errors works the same way, just the parameter to raise
has a different type:
valeri> (fn myfun () (raise (error "some-error")))
#<function myfun>
valeri> (myfun)
Uncaught error: #<continuation>
Function error
Function myfun
The difference with Python is mostly in the convenience of not having to type yield from
for intermediate functions. This makes the code completely transparent and at the first glance work as if it has side effects.
By using delimited continuations, you can implement many standard primitives like:
- Coroutines
- Exceptions
- Global state
- Green threads
- Lazy sequences
- ... and much more
The current implementation in Valeri is not yet capable of doing this, but mainly because there's currently no way to "catch" the continuation in the user's code. Stay tuned for the updates.