Konstantin Nazarov

Implementing globals

Dynamic languages usually implement global variables differently from static languages. For example, consider this C code:

const int globalvar = 42;

int foo() {
  return globalvar;
}

Here, the compiler knows that globalvar is defined, and the linker can even resolve references to it to a direct address. This is because a static language, by design, usually requires that all references be known in advance. Otherwise it would be impossible to determine their types and check some of the memory safety guarantees.

Many dynamic languages use a different strategy, especially those with a REPL. Consider, for example, this Python REPL session:

>>> def foo():
...     return globalvar
...
>>> globalvar = 42
>>> foo()
42

As you can see, Python has no problem compiling foo even though the variable has not been defined yet. We can then define the variable and call foo, and everything works. This is because Python sees that globalvar is not a local variable in the context of foo and emits a dynamic lookup that tries to find the variable by name in a dictionary of globals when the function runs.
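To see that lookup more directly, here is a small sketch that assumes CPython, where the compiled instruction is named LOAD_GLOBAL and the dictionary of globals is reachable as foo.__globals__. Other Python implementations may name things differently.

```python
import dis

def foo():
    # globalvar is not assigned inside foo, so it is not a local:
    # the compiler emits a lookup by name instead of a fixed address.
    return globalvar

# Inspect the compiled bytecode (CPython-specific instruction names).
opnames = [ins.opname for ins in dis.get_instructions(foo)]
has_global_lookup = "LOAD_GLOBAL" in opnames

# The dictionary that the lookup consults at call time.
foo.__globals__["globalvar"] = 42
first = foo()

# Rebinding the name is picked up without recompiling foo.
foo.__globals__["globalvar"] = 43
second = foo()
```

Writing through foo.__globals__ is equivalent to a plain top-level assignment in the module where foo was defined; it is spelled out here only to make the dictionary visible.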

For languages with a REPL, it is very convenient to be able to define and re-define top-level global variables and have all the rest of your functions pick them up without having to re-compile.

A REPL is a form of image-based development, just a tiny bit less powerful. In my language, I'm trying to build something that is familiar to Lisp developers: the ability to dynamically re-evaluate blocks of code from the editor by sending them to a running process.

For that, I've implemented support for addressing global variables in the virtual machine and the assembler. Ideally they should be module-scoped, but for now I only have one global scope, which will do for a while. Here's an example of how to use globals in the assembler:

;; Declare a global symbol
(global foo)
(const forty-two 42)

(sr 2)

;; Set the global to a constant
(setglobal foo forty-two)

;; Make sure the value of the global
;; is correct
(getglobal r0 foo)
(aeq r0 forty-two)

(retnil)

In this case, (global foo) is not an actual opcode, but a directive telling the assembler to add foo to an array of globals. The virtual machine uses a few optimizations so that it performs a dictionary lookup only on the first access to a global, not on every access.
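One way to get "dictionary lookup only on first access" is to cache the resolved storage cell next to each entry in the array of globals. The sketch below is written in Python purely for illustration, and every name in it (Cell, GlobalTable, Chunk, the lookups counter) is hypothetical. It is an assumption about how such a cache could work, not a description of the actual virtual machine.

```python
class Cell:
    """A mutable box holding the current value of one global."""
    def __init__(self):
        self.value = None
        self.defined = False


class GlobalTable:
    """The single global scope: maps names to cells."""
    def __init__(self):
        self.cells = {}
        self.lookups = 0  # counts dictionary lookups, for demonstration only

    def resolve(self, name):
        self.lookups += 1
        return self.cells.setdefault(name, Cell())


class Chunk:
    """One piece of bytecode with its own array of global names."""
    def __init__(self, table, global_names):
        self.table = table
        self.names = list(global_names)        # filled by (global ...) directives
        self.cache = [None] * len(self.names)  # resolved cells, one per name

    def _cell(self, index):
        cell = self.cache[index]
        if cell is None:
            # First access: do the dictionary lookup and remember the cell.
            cell = self.table.resolve(self.names[index])
            self.cache[index] = cell
        return cell

    def setglobal(self, index, value):
        cell = self._cell(index)
        cell.value = value
        cell.defined = True

    def getglobal(self, index):
        cell = self._cell(index)
        if not cell.defined:
            raise NameError(f"global {self.names[index]!r} is not defined")
        return cell.value


table = GlobalTable()
chunk_a = Chunk(table, ["foo"])
chunk_b = Chunk(table, ["foo"])

chunk_a.setglobal(0, 42)
for _ in range(1000):
    value = chunk_b.getglobal(0)
```

Because both chunks end up holding the same cell, a value set by one is visible to the other, and after the loop table.lookups is 2: one lookup per chunk, no matter how many times the global is read.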

So far, it does pretty much what Lua and Python do, and it doesn't look like anything special. But it will be important later on, when I add the ability to load multiple pieces of bytecode and have them call each other.