Different processors and instruction set architectures have always fascinated me. As a child of the late 70s and 80s I wasn’t quite old enough to appreciate the difference between the 6502 and 6809 processors, but by the early 90s and the introduction of the PowerPC, particularly in the PowerMac 6100, it was clear that ISAs come and go. What I didn’t appreciate at the time, and wouldn’t until the early 2020s, is the licensing model around ISAs. Plenty has been written on RISC-V and its opportunity to revolutionize the market, so I won’t repeat it here. All I can say is that I’ve had fun over the past couple of years supporting RISC-V by purchasing and tinkering with systems built on RISC-V processors.
Recently I had the good fortune of purchasing a brand new HiFive Unmatched and really wanted to stress it as a “daily driver” for building software natively on and for a RISC-V system. Why not use a cross-compiler? After all, my M1 MacBook Pro kicks the Unmatched’s ass; even a benchmark running under RISC-V emulation in a Docker container on the M1 is faster:
On an M1:
```
root@2972ef523083:~/primes_benchmark# ./primes_benchmark
Starting run
3713160 primes found in 7268 ms
236 bytes of code in countPrimes()
```
7.3 seconds.
On the HiFive Unmatched:
```
root@stormtrooper # ./primes_benchmark
Starting run
3713160 primes found in 18981 ms
236 bytes of code in countPrimes()
```
19 seconds.
So QEMU’s RISC-V emulation on the M1 beats what was considered a top-of-the-line RISC-V processor, the SiFive Freedom U740 SoC (and it may still be top of the line).
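To put a number on the gap, here is a small sketch in plain POSIX shell using the two timings above. The function name `ratio_x100` is made up for this example; shell arithmetic is integer-only, so the ratio is scaled by 100.

```shell
#!/bin/sh
# Times copied from the two benchmark runs above, in milliseconds.
m1_ms=7268
unmatched_ms=18981

# Integer-only arithmetic: return (a / b) * 100, truncated.
ratio_x100() {
  echo $(( $1 * 100 / $2 ))
}

r=$(ratio_x100 "$unmatched_ms" "$m1_ms")
printf 'Unmatched/M1 run-time ratio: %d.%02d\n' $((r / 100)) $((r % 100))
```

With these timings the ratio comes out at 2.61, so the native run takes roughly 2.6 times as long as the emulated one.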
But this is all missing the point, isn’t it? To propel RISC-V forward as an alternative we must move beyond using emulation and relying on cross-compilers, and push ourselves to “go native.” So that’s what we’ll do by compiling the LLDB debugger natively on the HiFive Unmatched. Why LLDB? Well, for whatever reason it wasn’t installed with `apt-get install llvm` or `apt-get install clang` on my machine, so I decided to take the plunge and compile it myself.
I strongly prefer to use “chroot jails” for building software. For starters, they are useful to ensure that dependencies are strongly tracked, and that “cruft” that accumulates over time doesn’t pollute your build environment.
Since our HiFive Unmatched is running Ubuntu 22.04 “Jammy”, it made sense to use debootstrap to create our jail. First, let’s install `debootstrap` with `apt-get install debootstrap` and then create the jail under `/opt/loadbuild/jails`:
```
root@stormtrooper # mkdir -p /opt/loadbuild/jails
root@stormtrooper # cd /opt/loadbuild/jails
root@stormtrooper # debootstrap jammy build-lldb
```
`debootstrap` will create a nearly fully functional environment for compiling LLDB natively. Once it’s complete, there are a few bind mounts we need to set up.
```
root@stormtrooper # cd /opt/loadbuild/jails

# Do the mounts
root@stormtrooper # mount --bind /dev build-lldb/dev
root@stormtrooper # mount --bind /proc build-lldb/proc
root@stormtrooper # mount --bind /sys build-lldb/sys
root@stormtrooper # mount -t devpts none build-lldb/dev/pts/

# chroot
root@stormtrooper # chroot build-lldb
```
From here, we are in the jail.
```
root@build-lldb # mkdir -p /opt/build
root@build-lldb # mkdir -p /opt/src
```
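One housekeeping note while we’re here: the bind mounts stay in place after you exit the jail. A minimal sketch for undoing them later, run from the host rather than from inside the jail, might look like the following. The `teardown_jail` function and the `JAIL` and `DRY_RUN` variables are hypothetical names for this example, and the jail path is assumed to be the one used above. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original setup: unmount the jail's
# mounts in the reverse of the order in which they were created.
JAIL="${JAIL:-/opt/loadbuild/jails/build-lldb}"   # assumed jail location
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print only, 0 = really unmount

teardown_jail() {
  # dev/pts was mounted last and sits on top of dev, so it has to go first.
  for mnt in dev/pts sys proc dev; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "umount $JAIL/$mnt"
    else
      umount "$JAIL/$mnt"
    fi
  done
}

teardown_jail
```

Once the printed commands look right, running the script as root with `DRY_RUN=0` performs the actual unmounts.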
Now we’re ready to review the instructions for building LLDB. They’re pretty straightforward, and consist of installing the prerequisites, configuring your build tree, and typing `make`. Well, in this case, `ninja`. Your dependencies are the typical cast of characters for a Debian-based distribution: `build-essential` for the compiler, and libraries for text interfaces (`libedit-dev`, `libncurses5-dev`). LLDB comes with a scripting facility, and it uses SWIG to generate the language bindings.
In Ubuntu 22.04 the `swig` package is in the “Universe” repository, so let’s add it:
```
root@build-lldb # sudo apt-get install -y software-properties-common
root@build-lldb # sudo add-apt-repository -y universe
```
Now we can install all of our dependencies for building LLDB.
```
root@build-lldb # sudo apt-get install -y build-essential \
                                          swig \
                                          python3-dev \
                                          libedit-dev \
                                          libncurses5-dev \
                                          ninja-build \
                                          cmake \
                                          screen \
                                          git
```
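Before committing to a build that runs for hours, a quick check that the tools actually landed on the `PATH` can save a false start. This is a minimal sketch and not part of the LLDB instructions: `missing_tools` is a made-up helper, and the binary names are an assumption about what the packages above install (for example, `ninja-build` providing `ninja`).

```shell
#!/bin/sh
# Print every name from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Binary names assumed from the packages installed above.
missing=$(missing_tools gcc g++ swig ninja cmake git screen python3)
if [ -n "$missing" ]; then
  echo "not on PATH:" $missing
else
  echo "all build tools found"
fi
```

Anything it lists is worth installing before starting the build.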
Editor’s note: I’m aware of `multistrap` but haven’t had time to research it.
```
root@build-lldb # cd /opt/src/
root@build-lldb # git clone https://github.com/llvm/llvm-project.git
root@build-lldb # cd /opt/build
root@build-lldb # cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lldb" -DLLVM_TARGETS_TO_BUILD=RISCV -DCMAKE_BUILD_TYPE=Release /opt/src/llvm-project/llvm
```
A couple of comments on the build options. First, we’re only going to build `lldb`, and we’re only going to support the RISCV architecture (that is, our `lldb` won’t be able to handle binaries for any other architecture). Building `lldb` requires `clang` to be “enabled”, so `LLVM_ENABLE_PROJECTS` must be set to `clang;lldb`. Finally, to save some disk space, we’ll build the `Release` version.
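Once `cmake` finishes, the options it settled on are recorded in `CMakeCache.txt` in the build directory, one `KEY:TYPE=VALUE` entry per line. A small sketch for reading them back, assuming the `/opt/build` directory from above; `cache_value` is a hypothetical helper name:

```shell
#!/bin/sh
# Print the value of one CMake cache entry. Entries look like KEY:TYPE=VALUE.
cache_value() {
  # $1 = variable name, $2 = path to CMakeCache.txt
  sed -n "s/^$1:[^=]*=//p" "$2"
}

# Assumed location: the build directory used above, as seen from inside the jail.
cache=/opt/build/CMakeCache.txt
if [ -f "$cache" ]; then
  for key in LLVM_ENABLE_PROJECTS LLVM_TARGETS_TO_BUILD CMAKE_BUILD_TYPE; do
    printf '%s = %s\n' "$key" "$(cache_value "$key" "$cache")"
  done
fi
```

If a value is not what you expected, it is much cheaper to rerun `cmake` now than to find out partway through the build.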
Now, before we build, create a `screen` session. This is going to allow us to get back to our build if `ssh` hangs up on us.
```
root@build-lldb # screen -S build-lldb
root@build-lldb # time ninja lldb lldb-server
[1/3237] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/ARMAttributeParser.cpp.o
```
And we’re off!
This will take some time. On my HiFive Unmatched it took the better part of 10 hours.
```
[3237/3237] Linking CXX executable bin/lldb

real    573m15.878s
user    2133m44.806s
sys     114m43.002s
```
Whew! A little over nine and a half hours to build `lldb`. Let’s hope it works!
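For the curious, the `time` figures above can be broken down with a little shell arithmetic. This is only a sketch: `to_hm` and `busy_cores_x100` are names invented for the example, and the seconds are dropped to keep the integer math simple.

```shell
#!/bin/sh
# Minutes copied from the `time` output above (seconds dropped).
real_min=573
user_min=2133
sys_min=114

# Wall-clock minutes as hours and minutes.
to_hm() {
  printf '%dh %02dm\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# (user + sys) / real, scaled by 100: roughly how many cores were busy on average.
# $1 = real, $2 = user, $3 = sys
busy_cores_x100() {
  echo $(( ($2 + $3) * 100 / $1 ))
}

to_hm "$real_min"
busy=$(busy_cores_x100 "$real_min" "$user_min" "$sys_min")
printf 'average busy cores: %d.%02d\n' $((busy / 100)) $((busy % 100))
```

That works out to 9h 33m of wall-clock time with an average of about 3.92 cores busy, which suggests `ninja` kept the processor’s application cores close to fully loaded for the whole build.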
Testing
Let’s look at some basics of LLDB with a toy C application:
```c
#include <stdio.h>
#include <stdlib.h>

int add(int a, int b) {
  return a+b;
}

int* add_arrays(int* a1, int* a2, int len) {
  int* c = malloc(sizeof(int)*len);

  for (int i = 0; i < len; i++) {
    c[i] = a1[i] + a2[i];
  }

  return c;
}

int main(void) {
  int a = 42;
  int b = 97;

  int a1[] = {1, 2, 3, 4, 5};
  int a2[] = {6, 7, 8, 9, 0};

  add(a, b);

  int* a3 = add_arrays(a1, a2, 5);

  for (int i = 0; i < 5; i++) {
    printf("%d + %d = %d\n", a1[i], a2[i], a3[i]);
  }

  return 0;
}
```
To get this in our debugger, compile with `gcc -g -o toy toy.c`, or, if you have `clang` installed, use it instead; the command-line arguments are the same.
Okay, here we go: `/opt/loadbuild/jails/build-lldb/opt/build/bin/lldb toy`. You can see I have not installed `lldb` anywhere and it’s still in its “jail” directory, but we’re no longer in the jail (where it sits at `/opt/build/bin/lldb`).
```
~ /opt/loadbuild/jails/build-lldb/opt/build/bin/lldb toy
(lldb) target create "toy"
Current executable set to '/home/joe/toy' (riscv64).
```
We’ll make this brief, as there are a lot of LLDB tutorials online. First, let’s set a breakpoint on the “most interesting” function, `add_arrays`. Hitting the TAB key after typing `add` shows the useful completion feature.
```
(lldb) b add
Available completions:
        add
        add_arrays
(lldb) b add_arrays
Breakpoint 1: where = toy`add_arrays + 20 at toy.c:9:31, address = 0x00000000000006ee
```
Let’s run with `r` (and yes, I’m aware `r` is a GDB command that LLDB maps to its own launch command):
```
(lldb) r
Process 35357 launched: '/home/joe/toy' (riscv64)
Process 35357 stopped
* thread #1, name = 'toy', stop reason = breakpoint 1.1
    frame #0: 0x0000002aaaaaa6ee toy`add_arrays(a1=0x0000003ffffff2f0, a2=0x0000003ffffff2d8, len=5) at toy.c:9:31
   6    }
   7
   8    int* add_arrays(int* a1, int* a2, int len) {
-> 9      int* c = malloc(sizeof(int)*len);
   10
   11     for (int i = 0; i < len; i++) {
   12       c[i] = a1[i] + a2[i];
```
Let’s dump the RISC-V registers!
```
(lldb) register read
General Purpose Registers:
        pc = 0x0000002aaaaaa6ee  toy`add_arrays + 20 at toy.c:9:31
        ra = 0x0000002aaaaaa7ca  toy`main + 116 at toy.c:27:13
        sp = 0x0000003ffffff270
        gp = 0x0000002aaaaac800
        tp = 0x0000003ff7fcd3e0
        t0 = 0x0000003ff7fdd4c8
        t1 = 0x0000003ff7febefc
        t2 = 0x0000003ff7fbf420
        fp = 0x0000003ffffff2b0
        s1 = 0x0000000000000001
        a0 = 0x0000003ffffff2f0
        a1 = 0x0000003ffffff2d8
        a2 = 0x0000000000000005
        a3 = 0x0000000000000000
        a4 = 0x0000003ffffff340
        a5 = 0x0000002aaaaaa756  toy`main at toy.c:18
        a6 = 0x0000003ff7fbfdb0
        a7 = 0x2f56405b0043434a
        s2 = 0x0000000000000000
        s3 = 0x0000002aaaaabe18  toy`__do_global_dtors_aux_fini_array_entry
        s4 = 0x0000002aaaaaa756  toy`main at toy.c:18
        s5 = 0x0000003ffffff4b8
        s6 = 0x0000002aaaaabe18  toy`__do_global_dtors_aux_fini_array_entry
        s7 = 0x0000003ff7ffdd18
        s8 = 0x0000003ff7ffe050
        s9 = 0x0000003fc562ab40
       s10 = 0x0000002ae41ff460
       s11 = 0x0000003fc562ab30
        t3 = 0x0000003ff7e9d9a0
        t4 = 0x000000000000eefc
        t5 = 0x0000000000000003
        t6 = 0xffffffffffffffff
      zero = 0x0000000000000000
```
I find it interesting that even the RISC-V `zero` register is printed out, but then again, it is a register.
A few other quick commands. Since we’re in a frame (function) we can print the local variables.
```
(lldb) frame variable
(int *) a1 = 0x0000003ffffff2f0
(int *) a2 = 0x0000003ffffff2d8
(int) len = 5
(int *) c = 0x0000000000000001
```
The arrays are passed as two integer pointers, and looking back at the registers we can see the RISC-V calling convention in action: registers a0, a1, and a2 hold the three function arguments in order. Let’s read the memory that the `a1` argument (the pointer held in register a0) points to:
```
(lldb) memory read 0x0000003ffffff2f0
0x3ffffff2f0: 01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00  ................
0x3ffffff300: 05 00 00 00 61 00 00 00 2a 00 00 00 00 00 00 00  ....a...*.......
```
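Each `int` in the dump above is stored as four bytes with the least significant byte first (little-endian). A small sketch for decoding a group of four bytes back into a number, where `le32` is a helper name invented for this example:

```shell
#!/bin/sh
# Decode four hex bytes, given in memory order (lowest address first),
# as one little-endian 32-bit value.
le32() {
  echo $(( 0x$1 | (0x$2 << 8) | (0x$3 << 16) | (0x$4 << 24) ))
}

le32 01 00 00 00   # first element of the array
le32 61 00 00 00   # the word right after the five array elements
le32 2a 00 00 00   # the word after that
```

This prints 1, 97 and 42. The last two values line up with `b = 97` and `a = 42` in `main`, so the words following the array are presumably those neighbouring locals on the stack.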
Nifty. We can even get naughty and write values into memory. Let’s write the value 42 into the second element of the array (index 1, four bytes past the start).
```
(lldb) memory write 0x0000003ffffff2f4 42
(lldb) memory read 0x0000003ffffff2f0
0x3ffffff2f0: 01 00 00 00 42 00 00 00 03 00 00 00 04 00 00 00  ....B...........
0x3ffffff300: 05 00 00 00 61 00 00 00 2a 00 00 00 00 00 00 00  ....a...*.......
```
Hitting `c` for `continue` we can see our handiwork:
```
(lldb) c
Process 35357 resuming
1 + 6 = 7
66 + 7 = 73
3 + 8 = 11
4 + 9 = 13
5 + 0 = 5
Process 35357 exited with status = 0 (0x00000000)
```
Well, it looks like the `memory write` command took its argument in base 16, but it worked all the same.
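The conversion is easy to double-check from the shell. A minimal sketch, with `hex_to_dec` and `dec_to_hex` as made-up helper names:

```shell
#!/bin/sh
# Convert between hex digits and decimal using printf.
hex_to_dec() { printf '%d\n' "0x$1"; }
dec_to_hex() { printf '%x\n' "$1"; }

hex_to_dec 42   # the value that actually landed in memory
dec_to_hex 42   # the hex digits for decimal 42
```

This prints 66 and 2a: the byte `42` we wrote is decimal 66, which is exactly what showed up in the program’s output, and going by the behavior seen above, typing `2a` instead would presumably have stored decimal 42.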
Closing
As I said at the start, there’s something about different instruction sets and CPU architectures that I find fascinating. History has shown that there is still a lot of room for innovation and competition, and RISC-V is proving that out each and every day. Is it ready to be your daily driver? Probably not, but that isn’t a function of the ISA but of the market and its evolution relative to other platforms. It will continue to make progress as companies like SiFive and StarFive bring processors and boards to the market. I have my sights set on the HiFive Pro P550 “Horse Creek” board, and hopefully we can cut some time off compiling LLDB!