Simple VM for dynamic languages #

A few weeks ago, I've started working on a lisp interpreter. I already did a few implementations of lisp in different languages, but those were mostly just recursive evaluators. This time, it's a bit more serious.

Instead of writing the implementation top-to-bottom, I started with a virtual machine. Virtual machines are used to execute most of the scripting languages, since bytecode is more compact and faster to evaluate. Compared to a tree-walker, bytecode VMs are better for branch prediction and friendlier to the CPU cache.

Because the VM is very barebones at the moment, and no language is written on top of it, I created a very simple assembly language. Here's an example of computing a factorial function in it:

li r1, 10
li r2, 1
li r0, 1
factorial:
    mul r0, r0, r2
    addi r2, r2, 1
    jle r2, r1, factorial

This code computes 10! and returns it in the r0 register.

The architecture of the VM is "load/store", meaning that computation (addition, multiplication, conditions, etc…) can only be performed on registers. Data can be loaded from the memory to registers with separate instructions. This contrasts a bit with how some VMs are implemented: for some reason many of them don't use registers at all, and instead rely only on the stack. For example, this is what a factorial function would look like in Python bytecode:

2     >>    0 LOAD_FAST                0 (N)
            3 LOAD_CONST               1 (1)
            6 COMPARE_OP               2 (==)
            9 POP_JUMP_IF_FALSE       16
3          12 LOAD_FAST               1 (result)
           15 RETURN_VALUE
4     >>   16 LOAD_FAST                0 (N)
           19 LOAD_CONST               1 (1)
           22 BINARY_SUBTRACT
           23 LOAD_FAST                0 (N)
           26 LOAD_FAST                1 (result)
           29 BINARY_MULTIPLY
           30 STORE_FAST               1 (result)
           33 STORE_FAST               0 (N)
           36 JUMP_ABSOLUTE            0

If you look carefully, you'd notice that there are no registers here. This is because all operations that write something, usually do so to the top of the stack.

There are a few problems I find with the stack machines: