VM progress update: arrays and function calls

The virtual machine I'm working on can now allocate and access arrays from bytecode and assembly. Arrays are one of the basic data structures in the VM, alongside numeric types and pointers. Until now, I could only work with arrays from unit tests, because memory allocation and garbage collection weren't fully there.

Now, with garbage collection and memory allocation in place, the following is possible:


;; Allocate an array of 10240 untyped
;; elements and place a pointer to it
;; into register r1
arri r1, 10240

;; Load '42' into register r2
li r2, 42

;; Store the content of r2 into the array
;; pointed to by r1 at offset 10000
asi r1, r2, 10000

;; Read the array element at offset
;; 10000 into register r0
ali r0, r1, 10000

With this saved as array.asm, you can byte-compile and execute it like this:


$ cat array.asm | ./asm | ./vm
R0: 42

For now, the virtual machine prints R0 by default once it has executed the last instruction.
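
To make the semantics of these four instructions concrete, here is a minimal Python sketch of how they could be interpreted. It is a model written for illustration, not the VM's actual implementation: the `run` function, the tuple encoding of instructions and the use of a Python list as the array are all assumptions, and memory management is left to Python.

```python
# Hypothetical model of the array instructions; not the VM's real code.
def run(program):
    regs = {}
    for op, *args in program:
        if op == "li":        # li rd, imm: load an immediate into rd
            rd, imm = args
            regs[rd] = imm
        elif op == "arri":    # arri rd, n: allocate n untyped slots
            rd, n = args
            regs[rd] = [None] * n
        elif op == "asi":     # asi ra, rs, off: array[off] = rs
            ra, rs, off = args
            regs[ra][off] = regs[rs]
        elif op == "ali":     # ali rd, ra, off: rd = array[off]
            rd, ra, off = args
            regs[rd] = regs[ra][off]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


program = [
    ("arri", "r1", 10240),
    ("li", "r2", 42),
    ("asi", "r1", "r2", 10000),
    ("ali", "r0", "r1", 10000),
]
print("R0:", run(program)["r0"])  # prints "R0: 42"
```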

Another interesting feature that is now available is "jump-and-link". It is based on the same concept in the RISC-V architecture and plays much the same role as x86 "call": it jumps to a specific offset and stores the address of the next instruction in the specified register (x86 "call" pushes that address onto the stack instead). Here's an example:


begin:
    ;; Load '42' into register r2
    li r2, 42

    ;; Jump to label 'fun' and store
    ;; the address of next instruction in
    ;; the r3 register
    jal r3, fun

    ;; Continue execution after return from
    ;; 'fun'
    ;; Increment register r0
    addi r0, r0, 1

    ;; Jump to label 'end' unconditionally
    jmp end

fun:
    ;; Copy content of register r2 into
    ;; register r0
    mov r0, r2

    ;; Jump to location saved in r3
    jr r3

end:
    ;; End of program
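
The control flow above can be traced with a small Python model. This is a sketch under assumptions, not the VM's interpreter loop: instruction addresses are plain list indices, labels are resolved by hand into the `labels` dictionary, and r0 is assumed to start at 0.

```python
# Hypothetical model of jal/jr/jmp; addresses are list indices.
def run(program, labels):
    regs = {"r0": 0}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        next_pc = pc + 1
        if op == "li":
            regs[args[0]] = args[1]
        elif op == "mov":
            regs[args[0]] = regs[args[1]]
        elif op == "addi":
            regs[args[0]] = regs[args[1]] + args[2]
        elif op == "jal":
            # Save the address of the next instruction, then jump.
            regs[args[0]] = pc + 1
            next_pc = labels[args[1]]
        elif op == "jr":
            # Jump to the address held in a register.
            next_pc = regs[args[0]]
        elif op == "jmp":
            next_pc = labels[args[0]]
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc = next_pc
    return regs


program = [
    ("li", "r2", 42),         # 0: begin
    ("jal", "r3", "fun"),     # 1
    ("addi", "r0", "r0", 1),  # 2
    ("jmp", "end"),           # 3
    ("mov", "r0", "r2"),      # 4: fun
    ("jr", "r3"),             # 5
]
labels = {"begin": 0, "fun": 4, "end": 6}
regs = run(program, labels)
print("R0:", regs["r0"])  # prints "R0: 43"
```

In this model 'fun' copies 42 into r0, 'jr r3' returns to index 2, and the 'addi' then brings r0 to 43.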

There are now stack operations as well. They look and work pretty much as you would expect on normal hardware, except that the stack is also accessible as an array.


;; Load '42' into register r2
li r2, 42

;; Load '43' into register r3
li r3, 43

;; Push register r2 onto the stack
push r2

;; Update the stack's top element
;; to be the content of register r3.
;; 'sp' is the register that holds the
;; stack pointer. Negative offsets
;; are supported as well.
asi sp, r3, 0

;; Pop the top element of the stack
;; into register r0
pop r0
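
One way to picture a stack that doubles as an array is the Python sketch below. The `Stack` class and its method names are made up for illustration, and two details are assumptions rather than facts about the VM: the stack grows towards higher indices, and 'sp' is the index of the top element, so negative offsets reach older elements.

```python
# Hypothetical model of a stack that is also addressable as an array.
class Stack:
    def __init__(self, size):
        self.slots = [None] * size
        self.sp = -1  # index of the top element; -1 means empty

    def push(self, value):
        if self.sp + 1 >= len(self.slots):
            raise OverflowError("stack overflow")
        self.sp += 1
        self.slots[self.sp] = value

    def pop(self):
        if self.sp < 0:
            raise IndexError("stack underflow")
        value = self.slots[self.sp]
        self.sp -= 1
        return value

    def _index(self, offset):
        # Only live elements (0..sp) may be addressed.
        index = self.sp + offset
        if not 0 <= index <= self.sp:
            raise IndexError("offset outside the live stack")
        return index

    def store(self, offset, value):  # models 'asi sp, rs, offset'
        self.slots[self._index(offset)] = value

    def load(self, offset):          # models 'ali rd, sp, offset'
        return self.slots[self._index(offset)]


stack = Stack(16)
stack.push(42)        # push r2
stack.store(0, 43)    # asi sp, r3, 0
r0 = stack.pop()      # pop r0
print("R0:", r0)      # prints "R0: 43"
```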

For the stack, I also have a "base pointer", which is meant to hold the address of the current call frame. It will be useful when the time comes to compile an actual language into the bytecode.

In addition to the new instructions, the virtual machine now loads a bundle of both code and data from its binary representation. The data section is needed for storing the larger constant objects that I will need when implementing a programming language on top of the VM: for example, 64-bit integers, floating-point constants, strings and others. For now, though, the assembly compiler just emits an empty data section because it can't properly evaluate constants yet. That is something I'll handle separately.
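
As an illustration of what a code-plus-data bundle can look like, here is a Python sketch that packs and unpacks one. The layout is entirely hypothetical and is not the VM's real binary format: a header of two little-endian 32-bit lengths, followed by the code bytes and then the data bytes. The names `pack_bundle` and `unpack_bundle` are made up for this example.

```python
import struct

# Hypothetical header: code length and data length, little-endian u32.
HEADER = struct.Struct("<II")


def pack_bundle(code: bytes, data: bytes) -> bytes:
    return HEADER.pack(len(code), len(data)) + code + data


def unpack_bundle(blob: bytes):
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    code_len, data_len = HEADER.unpack_from(blob, 0)
    code_end = HEADER.size + code_len
    data_end = code_end + data_len
    if len(blob) != data_end:
        raise ValueError("bundle length does not match header")
    return blob[HEADER.size:code_end], blob[code_end:data_end]


# A data section holding one 64-bit integer and one double constant.
data = struct.pack("<qd", 2**40, 3.5)
code = bytes([0x01, 0x02, 0x03])  # placeholder bytes, not real opcodes
blob = pack_bundle(code, data)

loaded_code, loaded_data = unpack_bundle(blob)
big_int, real = struct.unpack("<qd", loaded_data)
print(big_int, real)  # prints "1099511627776 3.5"
```

An empty data section, as the assembly compiler emits today, is simply a bundle whose data length is zero.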

All in all, I'm quite happy with the progress. The code is a bit messy and ugly, but it is slowly getting into better shape as I add more functionality and refactor bit by bit.

After ironing out the basics, I will probably work on dynamic linking and interaction with C through an FFI. This is needed to give code executing in the virtual machine the ability to call external functions (for filesystem access, networking and so on).

As usual, the experimental code can be found here.