Support for a long-awaited GNU C extension, asm goto, is in the midst of landing in Clang and LLVM. We want to make sure that we release a high-quality implementation, so it’s important to test the new patches on real code and not just small test cases. When we hit compiler bugs in large source files, it can be tricky to find exactly which part of a potentially large translation unit is problematic. In this post, we’ll take a look at using C-Reduce, a multithreaded code bisection utility for C/C++, to help narrow down a reproducer for a real compiler bug (potentially; in a patch that was posted, and will be fixed before it can ship in production) from a real code base (the Linux kernel). It’s mostly a post to myself in the future, so that I can remind myself how to run C-Reduce on the Linux kernel again, since this is now the third real compiler bug it’s helped me track down.
So the bug I’m focusing on when trying to compile the Linux kernel with Clang is a linkage error, all the way at the end of the build.
Hmm…looks like the object file (drivers/spi/spidev.o) has a section
(__jump_table) that references a non-existent symbol (.Ltmp), which looks
like a temporary label that should have been
cleaned up by the compiler. Maybe it was accidentally left behind by an
To run C-Reduce, we need a shell script that returns 0 when it should keep
reducing, and an input file. For an input file, it’s just way simpler to
preprocess it; this helps cut down on the compiler flags that typically
require paths (
First, let’s preprocess the source. For the kernel, if the file compiles
correctly, the kernel’s KBuild build process will create a file named in the
form path/to/.file.o.cmd, in our case drivers/spi/.spidev.o.cmd. (If the file
doesn’t compile, then I’ve had success wrapping make path/to/file.o with a
tool that records compile commands, then getting the compile_commands.json
entry for the file.) I find it easiest to copy this file to a new shell
script, then strip out everything but the first line. I then replace the
-c -o <output>.o with -E. chmod +x that new shell script, then run it
(outputting to stdout) to eyeball that it looks preprocessed, then redirect
the output to a .i file. Now that we have our preprocessed input, let’s
create the C-Reduce shell script.
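The edit described above (keep the first line of the .cmd file, swap -c -o <output>.o for -E) can be sketched as a small filter. This assumes the first line has the form `cmd_<path>.o := <compiler command>`, so check your own .cmd file before relying on it. `to_preprocess_cmd`, `preprocess.sh` and `spidev.i` are made-up names for illustration.

```shell
# Hypothetical helper: read a Kbuild .cmd file on stdin and print a
# preprocess-only version of its compile command on stdout.
# Assumes line 1 looks like: cmd_<path>.o := <compiler> <flags> -c -o <path>.o <source>
to_preprocess_cmd() {
  head -n 1 |
    sed -e 's/^cmd_[^ ]* := //' \
        -e 's/ -c -o [^ ]*\.o / -E /'
}

# Possible usage (paths as in the post, output names are placeholders):
#   to_preprocess_cmd < drivers/spi/.spidev.o.cmd > preprocess.sh
#   chmod +x preprocess.sh
#   ./preprocess.sh | less        # eyeball that it looks preprocessed
#   ./preprocess.sh > spidev.i
```

Doing the edit by hand works just as well; the filter only saves retyping when you repeat this for several files.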
I find it helpful to have a shell script in the form:
- remove previous object files
- rebuild object files
- disassemble object files and pipe to grep
For you, it might be some different steps.
As the docs show, you just need the shell script to return 0 when it should
keep reducing. From our previous shell script that preprocessed the source
and dumped a .i file, let’s change it back to stop before linking rather
than preprocessing (s/-E/-c/), and change the input to our new .i file.
Finally, let’s add the test for what we want. Since I want C-Reduce to keep
reducing until the disassembled object file no longer references anything
Ltmp related, I write:
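A script along these lines would fit that description. It is a sketch under stated assumptions, not the exact script used for this bug: the file names `spidev.i` and `spidev.o`, the `CC` and `OBJDUMP` defaults, and both function names are placeholders, and the real compile command would carry the full set of kernel flags copied from the .cmd file.

```shell
#!/bin/sh
# Placeholders: override CC/OBJDUMP to point at the toolchain under test.
CC="${CC:-clang}"
OBJDUMP="${OBJDUMP:-objdump}"

# Exit 0 when the disassembly on stdin mentions an Ltmp label.
mentions_ltmp() {
  grep -q 'Ltmp'
}

# The three steps from the list above: remove the previous object file,
# rebuild it, then disassemble (with relocations) and grep.
still_interesting() {
  rm -f spidev.o
  # Assumption: the real command also includes the kernel's flags here.
  "$CC" -c -o spidev.o spidev.i 2>/dev/null || return 1
  "$OBJDUMP" -Dr spidev.o | mentions_ltmp
}
```

End the script with a call to `still_interesting` so that its exit status becomes the script's: 0 means the object file still references an Ltmp label and reduction should continue. Referring to the input by its bare file name is deliberate, since the script may be run from a different working directory than the one it lives in.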
Now I can run the reproducer to check that it at least returns 0, which C-Reduce needs to get started:
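One way to make that check explicit is to print the verdict along with the exit status. `check_interesting` and `./repro.sh` are hypothetical names used only for this sketch.

```shell
# Hypothetical helper: run a command and report how an exit-status-based
# interestingness test would be read (0 means "keep reducing").
check_interesting() {
  if "$@"; then
    echo "interesting (exit 0)"
  else
    echo "not interesting (exit $?)"
  fi
}

# Possible usage, with a placeholder script name:
#   check_interesting ./repro.sh
```

If this reports a non-zero status on the unreduced input, fix the script first; there is nothing for the reducer to preserve yet.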
Now that we have a reproducer script and input file, let’s run C-Reduce.
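The invocation takes the test script and the input file; `--n` is the C-Reduce option for how many reduction attempts to run in parallel (check `creduce --help` for your version). The wrapper below is a sketch: `run_creduce`, `DRY_RUN`, `./repro.sh` and `spidev.i` are assumptions for illustration.

```shell
# Hypothetical wrapper around creduce.
#   $1: number of parallel jobs, $2: interestingness script, $3: input file
# Set DRY_RUN to any non-empty value to print the command instead of running it.
run_creduce() {
  threads="$1"
  test_script="$2"
  input="$3"
  ${DRY_RUN:+echo} creduce --n "$threads" "$test_script" "$input"
}

# Possible usage, with placeholder names:
#   DRY_RUN=1 run_creduce 40 ./repro.sh spidev.i   # just show the command
#   run_creduce 40 ./repro.sh spidev.i             # start reducing
```

The input file is rewritten in place as the reduction proceeds, so keep a copy of the original preprocessed file if you want to compare afterwards.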
So it took C-Reduce just over 6 minutes to turn >56k lines of mostly irrelevant code into 12 when running 40 threads on my 48-core workstation.
It’s also highly entertaining to watch C-Reduce work its magic. In another
terminal, I highly recommend running watch -n1 cat <input_file_to_creduce.i>
to see it pared down before your eyes.
Finally, we still want to bisect our compiler flags (the kernel uses a lot). I still do this process manually, and it’s not too bad. Having proper and minimal steps to reproduce compiler bugs is critical.
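The manual process amounts to dropping one flag at a time and keeping the removal whenever the bug still reproduces. As a rough sketch of how that could be scripted (this is not what was done for this bug): `reduce_flags` is a made-up helper, the check command is something you supply, and flags containing spaces are not handled.

```shell
# reduce_flags CHECK FLAG...
# CHECK is a command that receives a candidate flag list as its arguments
# and exits 0 when the bug still reproduces with exactly those flags.
# Prints the flags that could not be removed.
reduce_flags() {
  check="$1"
  shift
  kept=""
  while [ "$#" -gt 0 ]; do
    candidate="$1"
    shift
    # Try without `candidate`: flags kept so far plus the untested rest.
    if "$check" $kept "$@"; then
      : # still reproduces, so this flag is not needed
    else
      kept="$kept $candidate"
    fi
  done
  echo $kept
}
```

Each flag is tested once, in order, so the result is a flag set from which no single flag can be removed given that ordering; it is not guaranteed to be the smallest possible set.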
That’s enough for a great bug report for now. In a future episode, we’ll see how to start pulling apart LLVM to see where compilation is going amiss.