Build for debug using GDB
Here, we look at how to build applications so that they are as easy as possible to debug inside GDB.
-g and -O
Let’s talk about what the options -g and -O do. The first thing you need to know is that they are completely orthogonal things.
The first option, -g, compiles your program with the information to tell GDB what your program is doing. All the debug information goes in its own section in the executable or the object file and, importantly, it does not affect the code that is generated at all. On the other hand, -O does affect the generated code, but does not do anything with debug information.
If you are worried that adding -g will slow down your program, then don't worry. This option will make your binary a bit larger on disk, but that's it. Your program will run as normal and, while the application is running, the operating system won't even page the additional debugging information into memory.
However, compiling with optimizations (with -O) can, and often does, adversely affect your debugging experience.
Optimized out?
Let’s look at an example using this tiny little program:
Compile it and run it inside GDB:
gcc -O3 -g optimized.c
gdb a.out
Now if you run start and then print the value of foo, you'll see:
(gdb) print foo
$1 = <optimized out>
Seen that before? You probably have.
It’s somewhat annoying, partly because the message is misleading: it suggests the compiler has been super-clever and the value of foo doesn’t exist at all in your program, as if it had been optimized away. This isn’t really the case.
What this can mean is that the variable foo is not yet live at this point, even though notionally it’s in scope. In this example, we’re at the beginning of the line where foo is declared, but space hasn’t yet been allocated for it.
So, if I type next and then look at foo ... voilà, there it is:
(gdb) print foo
$2 = 1804289384
Inspecting with readelf
Let’s look at what’s going on here and how the debugger does what it does.
When we compile with -g, we're actually generating DWARF information. (DWARF is a pun on ELF, which stands for Executable and Linking Format. DWARF goes with ELF.)
Aside from the humor, what you need to know is that it’s a bunch of information generated by the compiler that the debugger uses to understand what your program is doing. At its simplest, this debug information might be telling you what line you’re on.
We can look at this using the readelf utility.
readelf --debug-dump a.out | less
This shows a dump of the DWARF info which the compiler stored in a.out along with your program.
In this dump, I can search for my variable foo and see that the DWARF information is basically a tree. You can see that foo is declared in file number 1 (the DW_AT_decl_file line underneath the DW_AT_name : foo line).
Following on from that, there is information about its line, column, type and location. The location tells us where the variable is live.
You can use the readelf utility to show the locations like so:
readelf --debug-dump=loc a.out | less
What this shows me is a list of places where this variable is live. From that, I see that over a given range of program counter offsets the variable is live in the register rax. It doesn't exist in memory, so there's no address to look at, but we can see it's in the register.
The entries that follow show the variable flowing through the program.
If you return to GDB, run next and then print the program counter with print $pc, you will see the 074 offset from the location list in its value. (Watch the video above to see this in action.)
In short, <optimized out> doesn't mean the variable isn't there. Sometimes you can step with next a few times until it becomes live. If you have reverse-debugging capabilities, you can back up to where it was live.
-g is not the only option
It’s worth knowing that -g is not the only option that you can pass to gcc or clang to control how much debug information is generated.
For example, the default is -g2, but you can pass -g3, the highest level, to generate more debug information. This will make the binary larger, but of course it won't affect the program itself.
This turns on basically everything and is likely to give you a better debugging experience.
Different versions of DWARF
There are different versions of the DWARF format. GCC 4.8 through GCC 10 generate DWARF 4 by default (with -g2), and on most targets GCC 11 and later default to DWARF 5.
However, if you have to build on older systems to guarantee that the resulting application will run on older systems, then the older compiler will likely do a worse job of generating DWARF. There’s a whole bunch of information it just can’t capture, so you’ll see more “optimized out” annoyances when you’re debugging.
But if you do have a modern compiler, -g3 will generate more debug information than the default level, including things like macro definitions.
For example, if you compile the following with gcc -g:
…then load it with gdb a.out, run start and then do:
(gdb) print VAL
You’ll see that you’re told VAL doesn't exist. This is because VAL is a macro: the preprocessor replaces it before the compiler proper runs, so the compiler has never even seen it.
However, if you compile with -g3 and load into GDB, the same print line will show:
$1 = 42
Hurrah! You can debug your macros.
Option -Og
As a final point, here’s how to think about balancing optimization with debugging experience.
You can specify -Og, which gives you a good debugging experience while remaining pretty fast.
Combined, the -g3 -Og options give you a nice balance: good performance in a program that is still nicely debuggable.
(The only reason not to use -g3 is if you're really sensitive to the size of your binaries on disk or the compile time of your code.)
And there you go! Hopefully that’s given you some good tips and a deeper understanding of how you build your program for debugging with GDB.
Originally published at https://undo.io.