Adding a debugger and ELF loader to my RISC-V emulator

After getting the first version of my RISC-V emulator working, I thought it would be nice if I could debug a program running inside it via GDB. A quick search revealed that GDB has a remote protocol. This protocol is text-based and allows the GDB debugger to interact with a remote target to step through the program, read registers, print values of variables, and do everything else you would expect a debugger to do.

On a desktop operating system such as Linux, GDB would normally use the ptrace syscall to stop the process and read its memory. This works because the debugger and the program run under the same operating system, which gives GDB direct access to the app that you're debugging. This is not the case for microcontrollers, where the program runs on a completely different CPU. For this purpose, microcontroller boards often have a separate debug port, such as JTAG, which can be used to debug them remotely.

This is why GDB has implemented a universal protocol, which you can implement to build a "bridge" to your specialized hardware. In my case, I decided to implement it for my emulator.

Quick intro to the GDB remote protocol

The remote protocol is text-based and works over a socket. The debugger sends commands and the target replies to each one. A typical command looks like this:

$packet-data#checksum

Here, packet-data contains the command to be executed. The checksum is just the sum of all characters in packet-data modulo 256, written as two hex digits. You don't really have to check it if you're running over a reliable protocol like TCP, where error detection is built in.
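To make the framing concrete, here is a minimal sketch of how a reply could be wrapped into a packet. The function names rsp_checksum and rsp_frame are made up for this example and are not taken from the emulator's source. It also skips the escaping that the protocol requires when the payload itself contains characters such as # or $.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Sum of all payload bytes modulo 256 (the cast keeps the value in one byte).
std::uint8_t rsp_checksum(const std::string& payload) {
    std::uint8_t sum = 0;
    for (unsigned char c : payload) sum = static_cast<std::uint8_t>(sum + c);
    return sum;
}

// Wrap a payload as $<payload>#<checksum as two lowercase hex digits>.
// Hypothetical helper: payload escaping is deliberately left out.
std::string rsp_frame(const std::string& payload) {
    char hex[3];
    std::snprintf(hex, sizeof hex, "%02x",
                  static_cast<unsigned>(rsp_checksum(payload)));
    return "$" + payload + "#" + hex;
}
```

For example, rsp_frame("OK") produces $OK#9a, because 'O' (0x4f) plus 'K' (0x4b) is 0x9a.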

Examples of commands:

$g# - get the values of all available registers
$p<register-number># - get the value of a specific register
$m<addr>,<size># - read size bytes of memory starting at address addr (both numbers are in hex)
... and a few more to set breakpoints and continue execution
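As an illustration of how little work a single command takes, here is a rough sketch of a handler for the memory read command. It assumes the emulator keeps its RAM in one flat byte vector loaded at a known base address. The names handle_read_memory, memory and base are hypothetical, and the error codes E01 and E02 are arbitrary picks for this example. The reply is simply the requested bytes written as hex.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Handle the payload of an 'm' packet, e.g. "m1000,4" (both numbers are hex).
// `memory` and `base` are assumptions: flat emulator RAM and its load address.
std::string handle_read_memory(const std::string& payload,
                               const std::vector<std::uint8_t>& memory,
                               std::uint32_t base) {
    unsigned long addr = 0, size = 0;
    if (std::sscanf(payload.c_str(), "m%lx,%lx", &addr, &size) != 2)
        return "E01";  // malformed request (error number chosen arbitrarily)
    if (addr < base || addr - base > memory.size() ||
        size > memory.size() - (addr - base))
        return "E02";  // requested range is outside emulator memory

    static const char digits[] = "0123456789abcdef";
    std::string reply;
    reply.reserve(size * 2);
    for (unsigned long i = 0; i < size; ++i) {
        std::uint8_t byte = memory[addr - base + i];
        reply += digits[byte >> 4];   // high nibble first
        reply += digits[byte & 0xf];
    }
    return reply;
}
```

With four bytes 13 05 00 00 loaded at 0x1000, the request m1000,4 would be answered with 13050000.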

One of the nice properties of the protocol is that you can implement only the bare minimum of commands and add more later as you need them. For example, if you want to add hardware breakpoint support or support for threads, you can do so, but you're not obligated to.

To report the features that your implementation has, you need to respond to the $qSupported# request with a semicolon-separated list of features.
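A small sketch of that reply, under the assumption that the stub keeps its features in a list of strings. The GDB manual describes the reply as entries separated by semicolons, and GDB appends its own feature list to the request after a colon, so the request is matched by prefix here. The helper names are invented for this example, and the two features in the usage note are only examples.

```cpp
#include <string>
#include <vector>

// Build the reply payload for a qSupported request from a feature list.
std::string supported_reply(const std::vector<std::string>& features) {
    std::string reply;
    for (const std::string& f : features) {
        if (!reply.empty()) reply += ';';  // entries are separated by ';'
        reply += f;
    }
    return reply;
}

// GDB sends "qSupported:<its own features>", so only the prefix is compared.
bool is_supported_query(const std::string& payload) {
    return payload.rfind("qSupported", 0) == 0;
}
```

Calling supported_reply({"PacketSize=1000", "qXfer:exec-file:read+"}) yields PacketSize=1000;qXfer:exec-file:read+.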

Because the protocol is so simple, I could implement all the basics in one go, in about 2-3 hours. The best way to learn the details is to look at the debug.cpp file in the repository.

How to actually run the debugger

I've added a command-line option to the emulator that loads the program and waits for a debugger to attach:

./rve --debug example/example

When you run it, the emulator waits for a debugger to connect on port 1234. Start GDB like this:

riscv32-none-elf-gdb

And then in the prompt, do the following:

file example/example
target remote :1234

After this, you can debug as usual: place breakpoints, read variable values, etc.

Automatically loading debug symbols

There was one small thing, however, that I found annoying: the need to type file example/example at the GDB prompt so that it loads the debug symbols of the program being debugged. Without it, all you get is the disassembled instructions in memory.

I looked through the documentation to see whether the GDB protocol has a command that lets GDB figure out the path to the program being executed automatically. After all, I already pass the program path to the emulator in order to run it, so it obviously has this knowledge.

It turns out that there is such a command. It's called $qXfer:exec-file:read:# and it does exactly what I want. The only catch is that the emulator has to report that it supports it. That was simple enough: I just needed to include this entry in the reply to $qSupported#:

qXfer:exec-file:read+

After that, one of the first things GDB does when attaching is send qXfer:exec-file:read:. It then uses the returned file name to load symbols and other debug information.
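Here is a sketch of how such a request could be answered. As far as I can tell from the GDB manual, the full request has the form qXfer:exec-file:read:annex:offset,length with hex numbers, and the reply starts with l when this is the last chunk or m when more data remains. The function name and the exec_path parameter are invented for this example, and escaping of special characters in the path is left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Answer "qXfer:exec-file:read:<annex>:<offset>,<length>" (numbers are hex).
// `exec_path` is a hypothetical variable: the program path given to the emulator.
std::string handle_exec_file_read(const std::string& payload,
                                  const std::string& exec_path) {
    const std::string prefix = "qXfer:exec-file:read:";
    if (payload.compare(0, prefix.size(), prefix) != 0)
        return "";  // an empty reply means "packet not supported"

    // Skip the annex (an optional process id) up to the next colon.
    std::size_t colon = payload.find(':', prefix.size());
    if (colon == std::string::npos) return "E01";  // arbitrary error number

    unsigned long offset = 0, length = 0;
    if (std::sscanf(payload.c_str() + colon + 1, "%lx,%lx", &offset, &length) != 2)
        return "E01";

    if (offset >= exec_path.size()) return "l";  // nothing left to send
    std::string chunk = exec_path.substr(offset, length);
    bool last = offset + chunk.size() >= exec_path.size();
    return (last ? "l" : "m") + chunk;
}
```

For the request qXfer:exec-file:read::0,fff and the path example/example, this returns lexample/example. The manual asks for an absolute file name, so a real stub would probably resolve the path before sending it.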

Loading ELF files

The first iteration of the emulator could only load a raw memory image of the program, because that was the easiest way forward. However, you can't give a raw image to GDB to load symbols from, because all debug symbols are gone by the time the raw image is produced. On a normal desktop operating system you usually don't execute raw memory images anyway. Instead, you load ELF files that contain program code, data and various other metadata.

To load debug symbols automatically, GDB needs to be given an ELF file instead. So I had to write an ELF loader myself, so that the emulator can run ELF programs directly. I thought that I would have to use libelf, but after reading the Wikipedia article about ELF files, it turned out that the format is really simple.

An ELF file starts with a fixed-size header, which tells you where to find a table of section headers and how many entries it has. Both can be mapped to plain structures in memory, and you don't have to do any tricky deserialization. So the algorithm for loading the code is roughly:

Read the ELF header to get the location and number of section headers
Iterate over every section header
Check whether the section name is either .text or .sdata
If it is, read the offset and size of the section data from the section header
Read the data at that offset and append it to the resulting byte array
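The steps above can be sketched roughly as follows. This is not the emulator's actual loader, just an illustration under a few assumptions: a 32-bit little-endian ELF file already read into memory, a little-endian host, and section headers of the standard 40-byte size. The struct and function names are invented, with fields following the order in the ELF specification. Like the steps above, it ignores each section's load address, which a real loader needs when the sections are not contiguous in memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// 32-bit ELF structures, with fields in the order the ELF specification lists them.
struct Elf32Header {
    std::uint8_t  ident[16];
    std::uint16_t type, machine;
    std::uint32_t version, entry, phoff, shoff, flags;
    std::uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstrndx;
};
struct Elf32Section {
    std::uint32_t name, type, flags, addr, offset, size, link, info, addralign, entsize;
};

// Copy the contents of .text and .sdata out of an ELF image held in memory.
std::vector<std::uint8_t> load_sections(const std::vector<std::uint8_t>& file) {
    std::vector<std::uint8_t> image;
    Elf32Header eh;
    if (file.size() < sizeof eh) return image;
    std::memcpy(&eh, file.data(), sizeof eh);
    if (std::memcmp(eh.ident, "\x7f" "ELF", 4) != 0) return image;  // bad magic

    // Read section header number `index`; fail if it or its data is out of bounds.
    auto read_section = [&](std::size_t index, Elf32Section& out) {
        std::size_t pos = eh.shoff + index * sizeof(Elf32Section);
        if (pos + sizeof(Elf32Section) > file.size()) return false;
        std::memcpy(&out, file.data() + pos, sizeof(Elf32Section));
        return std::uint64_t(out.offset) + out.size <= file.size();
    };

    Elf32Section names;  // the section that holds all section names
    if (!read_section(eh.shstrndx, names)) return image;

    for (std::size_t i = 0; i < eh.shnum; ++i) {
        Elf32Section sh;
        if (!read_section(i, sh) || sh.name >= names.size) continue;
        const char* begin =
            reinterpret_cast<const char*>(file.data()) + names.offset + sh.name;
        // Make sure the name is NUL-terminated inside the string table.
        if (!std::memchr(begin, 0, names.size - sh.name)) continue;
        std::string name(begin);
        if (name == ".text" || name == ".sdata")
            image.insert(image.end(), file.begin() + sh.offset,
                         file.begin() + sh.offset + sh.size);
    }
    return image;
}
```

The bounds checks are there because the file comes from outside the emulator. Without them, a truncated or corrupt ELF file would make the loader read past the end of the buffer.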

All in all, the code for this ended up being just under 150 lines. This is because the emulator doesn't need to load any other sections, and I didn't need to bother with dynamic linking (the binary is fully statically linked).

Further work: reverse execution

There's a fun thing you can do inside an emulator, which you can rarely do on a normal CPU. It's called time-travel debugging or, as GDB calls it, reverse execution. For it to work, the virtual machine needs to keep a log of every operation that changes state, like writing to memory or registers. When GDB asks to run the program backwards, you just go and undo that log one item at a time.
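The idea can be sketched in a few lines. None of this exists in the emulator yet, so every name here (Machine, UndoEntry, Recorder) is hypothetical. The recorder saves the old value before each write and restores it when asked to undo.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical machine state: 32 registers plus a small flat memory.
struct Machine {
    std::uint32_t regs[32] = {};
    std::vector<std::uint8_t> mem = std::vector<std::uint8_t>(64);
};

// One undo record: where a write happened and what was there before.
struct UndoEntry {
    enum class Kind { Register, Memory } kind;
    std::uint32_t index;      // register number or memory address
    std::uint32_t old_value;  // value that the write replaced
};

class Recorder {
public:
    explicit Recorder(Machine& m) : m_(m) {}

    // Every write goes through the recorder so the old value is saved first.
    void write_reg(std::uint32_t r, std::uint32_t v) {
        log_.push_back({UndoEntry::Kind::Register, r, m_.regs[r]});
        m_.regs[r] = v;
    }
    void write_byte(std::uint32_t addr, std::uint8_t v) {
        log_.push_back({UndoEntry::Kind::Memory, addr, m_.mem[addr]});
        m_.mem[addr] = v;
    }

    // Undo the most recent write; returns false when the log is empty.
    bool undo_one() {
        if (log_.empty()) return false;
        UndoEntry e = log_.back();
        log_.pop_back();
        if (e.kind == UndoEntry::Kind::Register)
            m_.regs[e.index] = e.old_value;
        else
            m_.mem[e.index] = static_cast<std::uint8_t>(e.old_value);
        return true;
    }

private:
    Machine& m_;
    std::vector<UndoEntry> log_;
};
```

A single instruction usually changes more than one thing (a register and the program counter, for example), so a real implementation would probably group log entries per instruction and undo a whole group for each reverse step.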

This is simple and very fun to implement, so I'll likely do that in the future. Stay tuned.