Testing my RISC-V implementation with the RISCOF test suite

I wrote my RISC-V emulator in a hurry and was pretty sure it had bugs. I considered writing a test suite for it, but quickly realized that there would be many edge cases for which I would need to create test data. Given that there are about 40 opcodes in the base spec, it looked like testing would be much harder than writing the emulator itself.

So I decided to look for an existing test suite from the community. It turns out there is one: RISCOF (RISC-V Compatibility Framework), a test runner built on the test suite in riscv-arch-test. It helps verify that an implementation doesn't have the obvious, easy-to-catch kind of bugs: there are tests for all instructions and for most of the typical cases where implementations tend to go wrong.

How the RISCOF runner works

It took me a while to understand how RISCOF runs the tests and extracts the results. The documentation doesn't help much, because it's very sparse on details. You really have to bang your head against it for a bit.

In essence, the harness compiles each test for two different targets: one is your implementation, and the other is a reference implementation. The test binary contains individual test cases, each of which checks an opcode under certain conditions. When a test case runs, it records its result in a special memory region. The set of results in this region is referred to as the "signature". As soon as the test is finished, the emulator is expected to dump the memory region holding the signature to a text file in hex format.
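
As a rough illustration of the dumping step (a sketch, not my emulator's actual code), the Python below formats a signature region as text. It assumes one 32-bit little-endian word per line, written as eight lowercase hex digits, which is my understanding of what the reference models emit by default; check what your reference model actually produces. The names format_signature, memory, begin, end and base are all made up for this example.

```python
import struct


def format_signature(memory: bytes, begin: int, end: int, base: int = 0) -> str:
    """Render memory[begin:end] as one 32-bit hex word per line.

    `memory` is the emulator's RAM as a bytes object and `base` is the
    guest address that memory[0] corresponds to (both hypothetical).
    Assumption: the expected layout is 4 bytes per line, little-endian.
    """
    lines = []
    for addr in range(begin, end, 4):
        # Read the word as the guest would: little-endian, unsigned.
        (word,) = struct.unpack_from("<I", memory, addr - base)
        lines.append(f"{word:08x}")
    return "\n".join(lines) + "\n"
```

For example, a region holding the bytes ef be ad de followed by 01 00 00 00 comes out as the two lines deadbeef and 00000001.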

To check whether your virtual machine implementation is correct, its signature is compared against the one produced by the reference implementation. If the two are equal, the test passes.
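
RISCOF does this comparison for you, but when a test fails it can be handy to find the first differing word by hand. Here is a small sketch of such a check; first_mismatch is a hypothetical helper, and it assumes both signatures are plain text with one hex word per line.

```python
def first_mismatch(dut_text: str, ref_text: str):
    """Return (line_index, dut_word, ref_word) for the first difference,
    or None if the two signatures match.

    Blank lines are ignored and hex digits are compared case-insensitively.
    A missing word (one signature shorter than the other) shows up as None.
    """
    dut = [line.strip().lower() for line in dut_text.splitlines() if line.strip()]
    ref = [line.strip().lower() for line in ref_text.splitlines() if line.strip()]
    for index in range(max(len(dut), len(ref))):
        got = dut[index] if index < len(dut) else None
        expected = ref[index] if index < len(ref) else None
        if got != expected:
            return index, got, expected
    return None
```

The index of the mismatching word tells you how far into the signature region the failing test case wrote its result, which narrows down which test case to look at.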

Extracting the signature

I expected the signature to have a constant size, but apparently it doesn't. It can also be located in different regions of memory, so I can't just hardcode the location and let my virtual machine dump it.

To find the signature, I had to add code to the ELF loader that extracts the addresses of two symbols: begin_signature and end_signature. These two delineate the memory region that should be dumped. I only figured out that this was necessary by studying the code of the Sail implementation, specifically its C emulator.
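
To give an idea of what that lookup involves, here is a minimal sketch in Python. It is not my loader's code: find_symbols is a made-up name, and the sketch assumes a little-endian 32-bit ELF file with a regular .symtab section. It walks the section headers, finds the symbol table, and resolves names through the string table that the symbol table links to.

```python
import struct

SHT_SYMTAB = 2  # section type of the static symbol table


def find_symbols(elf: bytes, wanted) -> dict:
    """Return {name: address} for each symbol name in `wanted` that is found.

    Assumption: `elf` is a little-endian ELF32 image (as for RV32).
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF32 file")

    # ELF32 header: e_shoff at offset 32, e_shentsize/e_shnum at 46/48.
    (e_shoff,) = struct.unpack_from("<I", elf, 32)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 46)

    # Each Elf32_Shdr is ten 32-bit fields:
    # name, type, flags, addr, offset, size, link, info, addralign, entsize
    sections = [
        struct.unpack_from("<10I", elf, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]

    found = {}
    for section in sections:
        sh_type, sh_offset, sh_size = section[1], section[4], section[5]
        sh_link, sh_entsize = section[6], section[9]
        if sh_type != SHT_SYMTAB:
            continue
        strtab_offset = sections[sh_link][4]  # sh_link -> string table
        for off in range(sh_offset, sh_offset + sh_size, sh_entsize or 16):
            # Elf32_Sym starts with st_name (string offset) and st_value.
            st_name, st_value = struct.unpack_from("<II", elf, off)
            start = strtab_offset + st_name
            name = elf[start:elf.index(b"\0", start)].decode("ascii", "replace")
            if name in wanted:
                found[name] = st_value
    return found
```

Calling find_symbols(data, {"begin_signature", "end_signature"}) on the contents of a test binary would then give the two addresses that bound the region to dump.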

Types of bugs discovered

I wasn't disappointed by the results. The test suite did indeed find some serious bugs. Some of them were just me being careless, and some were misunderstandings of how the opcodes work. Here are examples:

The lb instruction, which loads a single byte from memory into a 4-byte register, needs to sign-extend the byte. If the memory contains 0xff, the value in the register should be 0xffffffff and not 0x000000ff.
The jalr opcode contains a 12-bit immediate value. It can be negative, in which case its highest bit is 1. Since it is narrower than the 32-bit variable it's stored in, I need to sign-extend it manually.
slt (set less than) had its result inverted (0 instead of 1 and vice versa).
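
The three fixes above all come down to handling signed values correctly. The sketch below shows the idea in Python, with registers modelled as unsigned 32-bit integers; the function names are made up for illustration and this is not the emulator's actual code.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def lb(byte: int) -> int:
    """Register value after loading one byte with lb (sign-extended)."""
    return sign_extend(byte, 8) & MASK32


def jalr_target(rs1: int, instruction: int) -> int:
    """Jump target of jalr: rs1 plus the sign-extended 12-bit immediate.

    The immediate lives in bits 31..20 of the instruction word, and the
    lowest bit of the resulting address is cleared.
    """
    imm = sign_extend(instruction >> 20, 12)
    return (rs1 + imm) & MASK32 & ~1


def slt(rs1: int, rs2: int) -> int:
    """1 if rs1 < rs2 when both are treated as signed, otherwise 0."""
    return 1 if sign_extend(rs1, 32) < sign_extend(rs2, 32) else 0
```

With these definitions, lb(0xff) gives 0xffffffff, a jalr whose immediate field is 0xffc (that is, -4) applied to rs1 = 0x1000 jumps to 0xffc, and slt(0xffffffff, 1) is 1 because 0xffffffff is -1 as a signed value.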

Of course, these are all trivial bugs that would get noticed sooner or later when running real programs. In fact, I had already fixed a few of them that way. But it's really not much fun to debug a program opcode by opcode, trying to guess which one is misbehaving.

So if you're in the same situation as I was, give it a go. It will save you some time.