A GDB Primer

Cengiz Akinli

 

This is a VERY quick and dirty primer on using GDB to debug the basic C++ projects in CS2604 and similar courses.

 

GDB is great because it lets you see the innards of your program after it has crashed OR while it is still running. You can stop the program at any point and even under specific conditions only, and take a look at variable values. Then, you can walk it forward line by line, watching variable values to see what it's doing, and even run functions at any time, even though they're not called by your code.

 

I'm not even gonna try to write an exhaustive howto even on just the commands we would use in our simple projects. Instead, I'm just gonna give as concise a description as possible of those commands, and how we're most likely to use them. The purpose of this document is to give those of you who are an hour away from the deadline and have no idea what's wrong with your program a way to get up and running with GDB as fast as possible to solve what, with GDB at your disposal, could quickly turn into a trivial problem.  GDB is a very robust debugging tool, and what I'm describing here is a tiny portion of what it can do.  The debugger in MS VS.net is like a toy compared to GDB.  Now, it takes time to learn, but that's what any powerful system would require.  Don't let MS convince you that ease-of-use equals power.  If that were true, we'd all be learning to program in BASIC right now, wouldn't we?

 

If you've got time to delve deeper, the online help is far more useful than what I've written here, provided you have a working knowledge of the program, and it does a far better job than I ever could. Problem is, if you don't have a working knowledge, the online help may as well be Greek. Well, this document will hopefully give you that working knowledge so that when you do have time, you can go learn more and actually get something from the help.

 

Look at this document as more of an application-oriented (the application being the debugging of programs written for class) tutorial.

 

Alright, here we go.

 

Note about this document: Shell commands will appear in blue, whereas gdb commands will appear in red. Both will include prompts for clarity which you do not type.  Shell commands are preceded with a $, and GDB commands with a (gdb), and all computer input and output is in monospace fonts. Shell commands are for csh (I don't know bash well). To run csh type:

     $ csh

 

1. Rebuilding with debugging symbols

 

Before you can do any of this, you have to have debugging symbols in your executable.  This is done at compile time by adding the -g switch to the gcc/g++ compile command.  So,

$ g++ -c -o somefile.o somefile.cpp

becomes

$ g++ -c -g -o somefile.o somefile.cpp

 

Rebuild everything using commands like that, then link it all together to make your executable as normal.

 

2. Start GDB

 

You can use GDB in one of two ways:

 

1. To examine the core file.
This is the innards of a dying program dumped into a file for later analysis. Very handy. This is often referred to as post-mortem debugging. To do it, type:
$ limit coredumpsize unlimited
This allows your dying program to write a core file, in case your default environment isn't already set up this way. (That's the csh command; in bash, the equivalent is ulimit -c unlimited.)

Then, run your program, and after it dies (it should say that it dumped core), do
$ gdb myprog core
where myprog is your program's executable filename.

 

2. To examine a running program. This, in turn, can be accomplished in one of two ways:

a. Attaching to an already running instance of your program.
This is good if your program HAS to read input from a file or do other stuff that's not easy to do from within GDB.

To do this, figure out your program's process ID, then start GDB with,
$ gdb myprog
Then, within GDB, do
(gdb) attach PID
where PID is your program instance's process ID.

b. Running the program from within GDB.
Start up gdb with,
$ gdb myprog
Then, if you want it to stop at a particular location within the program, in GDB, do
(gdb) break somefile.cpp:123
where somefile.cpp:123 is the source code file and line number where you want GDB to stop the program.  Note that it will stop the program just BEFORE executing that line of code, not after.  Finally, run the program with,
(gdb) run

 

2.1  Debugging a crashing program

   Ok, so now we're in GDB and examining either a running program or a core file.  The first thing GDB is gonna tell you is the line number where the program is currently stopped (if you have a live program), or where it crashed, if you're examining a core file.

 

   Unless it's a really simple program, the first thing you will want to do is get a backtrace of the execution stack.  This is done with,

 

(gdb) backtrace

   or

(gdb) bt

   for short.

The output will look something like this example:

(gdb) bt
#0  0x40084ea8 in chunk_alloc (ar_ptr=0x40119d60, nb=16) at malloc.c:2875
#1  0x400845ce in __libc_malloc (bytes=7) at malloc.c:2696
#2  0x40089a29 in __strdup (s=0xbfffdc4f "(null)") at strdup.c:43
#3  0x804c706 in mycc_stat (ct=0xbffff874, icc_conn=0xbffffb18)
    at invoice.c:415
#4  0x80554c0 in batchclose (icc_conn=0xbffffb18) at bclose.c:158
#5  0x8054eba in close_batch (icc_conn=0xbffffb18) at bclose.c:33
#6  0x804a97e in main () at icc.c:96

 

Now, this is a stack.  Remember what that means.  It's a LIFO.  It tells me that my main() function was at line 96 in file icc.c, executing function close_batch(), which got to line 33 in file bclose.c, where it executed function batchclose(), which got to line number 158 in that same file, where it executed function mycc_stat(), which got as far as line 415 in file invoice.c, where it executed function __strdup(), etc.

 

But __strdup() is not my function.  The program continued down into the bowels of system code until the actual operation (a memory write, usually) elicited the signal.  But the last function (counting from the bottom of the stack going up) that is in my code is mycc_stat().  I now know that my program died in function mycc_stat() at line 415 of file invoice.c.  So that's where I'm looking first.  I need to go to that stack frame and poke around at variable values and code and try to figure out what happened.  I go to that stack frame with,

 

(gdb) frame 3

 

Now, I can start poking around.  I can list source code with the list command or what's probably better is to just bring up the code in another window.  Immediately after giving the frame command, GDB tells me what line number we're on, and lists that line of code:

 

#3  0x804c706 in mycc_stat (ct=0xbffff874, icc_conn=0xbffffb18)
    at invoice.c:415
415             ct->dbtype = strdup(ptr2);

 

Ok, so my first guess is usually that I've done something bad with a NULL pointer.  It's not necessarily the MOST likely thing, but it's the easiest to find, so I check that out first.  Either ct is NULL, and my attempting to dereference it died, or perhaps ptr2 is NULL, and strdup() choked on it.  Here, I have my first clue.  In C/C++, the RHS of an assignment statement is evaluated THEN assigned to the LHS.  Well, remember, my program died INSIDE the strdup() call.  So my guess is that ptr2 is either NULL, or points off into space somewhere that my program wasn't allowed to read from.  So let's see which:

 

(gdb) print ptr2

$1 = 0xbfffdc4f "(null)"

 

Aw crap.  Well, it isn't NULL.  It points to 0xBFFFDC4F (that's the hexadecimal address in the program's memory to which this guy points).  Don't let the "(null)" confuse you.  That's the actual string stored at that address, which GDB prints after the pointer value.  If ptr2 had been NULL, I'd have gotten:

 

$1 = 0x0

 

But, I'm not that lucky.  There's no obvious reason it should be crashing here.  Now what?

 

Well, the news isn't all bad.  There is a good place to go for help.  The odds are surprisingly good that it's crashing here because at some other point, my program wrote past the end of some block of memory and overwrote some bit of data and/or executable instruction code.  If it had tried to write to address 0x0 (NULL), then that would've elicited a segfault from the OS immediately.  But instead, it may have written some place proper, and just written too much.  The problem only cropped up here because we tried to access that data or code just now (note that frame #0 is inside malloc's own chunk_alloc(), which hints that the allocator's bookkeeping got trampled).  But it would seem I have no way of knowing when that happened.

 

Or do I?

 

Well, there was a time when people in this situation actually WERE screwed.  But now, we have a whole class of tools called malloc() debuggers.  The malloc() library function is the basic call used to allocate memory.  We used it for the memory manager whether we knew it or not (new and delete are typically built on top of malloc() and free()).  It allocates memory, and returns the address (sound familiar?) of the start of the memory block.  But unlike a self contained memory manager, it leaves it to us to do the writing.  It allocates however many bytes we ask for and gives us back the starting address, trusting us not to write more bytes than we requested.  So what happens if we do?  Well, funny things first, then death.  But oftentimes, the death comes many dozens or hundreds of lines of code later, leaving us guessing where the misbehaving line of code actually was.
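As a small sketch of the kind of slip that causes this, here is the classic one: forgetting the byte for the string terminator.  The helper name duplicateString is made up for this example, and the version shown is the correct one, with the mistake described in the comment.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper, for illustration only: returns a heap copy of s,
// or nullptr if s is null or the allocation fails. The caller must free() it.
char* duplicateString(const char* s) {
    if (s == nullptr) return nullptr;

    // The +1 is for the terminating '\0'. Asking malloc() for only
    // std::strlen(s) bytes and then copying the terminator as well would
    // write one byte past the end of the block: exactly the kind of
    // overflow that may not crash until much later, if at all.
    std::size_t bytes = std::strlen(s) + 1;

    char* copy = static_cast<char*>(std::malloc(bytes));
    if (copy != nullptr) {
        std::memcpy(copy, s, bytes);  // copies the characters and the '\0'
    }
    return copy;
}
```

malloc() itself has no way of noticing the one-byte-short version; it just hands back the address and trusts us.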

 

So malloc() debuggers help us out.  Instead of giving us blocks of memory packed in next to each other, which is what the system malloc() does, they replace the system malloc() with one that puts a protected page of memory right up against each block they allocate for us.  This way, if we write past the end (overflow) or out the front end of the block (underflow), we go into that protected page and elicit the segfault immediately, and presto!, the errant line of code can hide itself no more!  (Which side gets guarded can depend on the tool and its settings; Electric Fence, the one I use below, guards the end of each block by default and the front only if you set the EF_PROTECT_BELOW environment variable.)  So the debugger doesn't stop the program from crashing, but rather makes it crash when it should, rather than later on or not at all (which is even worse).  This is great for us, as students.  Because if our code does something screwy, we have the added risk that it MAY NOT crash at ALL with our own test data.  So we feel confident and hand in a broken program.  Instead, malloc() debuggers help us flush out these problems before the TAs do.
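To make the idea concrete, here is a toy sketch of a guarded allocator.  Be warned that it swaps in a simpler technique: it surrounds each block with guard bytes and checks them on request, whereas the malloc() debuggers described above use protected pages so that the bad write itself faults.  All the names (guardedAlloc, guardsIntact, guardedFree, kHeader, kGuardSize, kGuardByte) are made up for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Toy illustration only; this is NOT how a page-protecting malloc() debugger works.
// Layout of one allocation: [header: requested size][front guard][user bytes][back guard]
static const std::size_t kHeader = 16;         // room for the size; 16 keeps user data aligned
static const std::size_t kGuardSize = 16;      // bytes of guard on each side
static const unsigned char kGuardByte = 0xAB;  // pattern written into the guards

// Allocates n usable bytes surrounded by guard bytes. Returns nullptr on failure.
void* guardedAlloc(std::size_t n) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeader + 2 * kGuardSize + n));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &n, sizeof n);                                   // remember the size
    std::memset(raw + kHeader, kGuardByte, kGuardSize);               // front guard
    std::memset(raw + kHeader + kGuardSize + n, kGuardByte, kGuardSize);  // back guard
    return raw + kHeader + kGuardSize;
}

// Returns true if neither guard has been overwritten.
bool guardsIntact(const void* user) {
    const unsigned char* p = static_cast<const unsigned char*>(user);
    const unsigned char* raw = p - kGuardSize - kHeader;
    std::size_t n;
    std::memcpy(&n, raw, sizeof n);
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (raw[kHeader + i] != kGuardByte) return false;  // underflow hit the front guard
        if (p[n + i] != kGuardByte) return false;          // overflow hit the back guard
    }
    return true;
}

// Releases a block obtained from guardedAlloc().
void guardedFree(void* user) {
    if (user == nullptr) return;
    std::free(static_cast<unsigned char*>(user) - kGuardSize - kHeader);
}
```

After guardedAlloc(8), writing to byte 8 lands in the back guard, so guardsIntact() returns false from then on.  The catch is that this only tells you when you ask; with a page-protecting debugger the same write would segfault on the spot, on the guilty line.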

 

Ok, so there's no shortage of these malloc() debuggers flying around the net.  I like ElectricFence.  You need to go get it, build it, and install it.  You can install it in your own account on the lab machines, or in /usr/lib on your own machine.  In your account, I would recommend making a directory called lib off your home directory and putting it there.  That way in the future, you can put other libraries there for reuse, including your own code while developing future programs.  Then, you need to link against this library.  You need to add -lefence to the end of your g++ link command.  If you're doing this in your lab account, you also need to add your ~/lib directory to your library search path with the -L option (if you're on your own machine, and you installed ElectricFence in one of your /lib, /usr/lib, or /usr/local/lib directories, you don't need to do this).  So for me, my linking step changes from

 

$ g++ -o myprog somefile.o someotherfile.o main.o

to

$ g++ -L/home/ugrads/c/cakinli/lib -o myprog somefile.o someotherfile.o main.o -lefence

 

Note that I can't use the ~ abbreviation here: the shell only expands ~ at the start of a word, and here it would come right after -L.  Writing -L$HOME/lib works, though, since variables are expanded anywhere in a word.

 

Alright, so now we have a program that should crash where it went wrong instead of much later.

 

But what if this doesn't do it for us?  What if it's not crashing at all?

 

 

2.2  Debugging a misbehaving program

 

Well, we know how to load the program into GDB and run it.  We know about the break command.  Now we'll learn a few more to track down those annoying data errors.  What we're gonna do is trace the program line-by-line, or block-by-block, and examine variables at each step to figure out what's going on.  So we start with the break command.  Decide where we want the program to stop and kick us out to the GDB command line.  Do that now, then run the program.  Maybe feed it input, do whatever, and whammo, GDB stops the program where you told it to, and you're now staring at a (gdb) prompt.  What are some things you can do?

 

print

Print the value of an expression, be it a primitive type or a pointer to any type.  In the case of a pointer, the pointer address is printed and, if possible, the value stored there.

e.g.

(gdb) print someVar

 

printf

Execute a complex print statement using a format string with conversion specifiers exactly like the ANSI C printf() call.

e.g.

(gdb) printf "x = %d\ny = %d\nf = %5.2f", x, y, z

 

next

Execute the current line of code and move on to the next.

 

step

Step into the current line of code, if a function call, and stop before the first line of execution within that function.

 

continue

Continue execution until the next breakpoint or end of program is reached.

 

until

Continue execution until a line numbered higher than this one is reached.  Typically, you do this at the end of a loop to run the rest of the loop, then stop again.

 

finish

Continue execution until the current stack frame (current function) returns.

 

break

Insert a breakpoint at a specified line in the current file or another specified file.

e.g.

(gdb) break 415

(gdb) break somefile.cpp:251

 

Note also that some of the commands which control program flow take an optional numerical argument.  For next and step, it's how many times in a row the command should be executed.  For continue, it's an ignore count: continue N means don't stop at the current breakpoint again until the Nth time it's reached.  So let's say you're in a loop, and you're stopped at a breakpoint you set at the fifth line of the loop, but you know that the line of data that's causing problems is the tenth piece of data.  You might do:

(gdb) continue 8

to let the program pass right through the breakpoint 7 times and stop on the 8th time it's reached (the ninth piece of data, if you were stopped on the first), leaving you one iteration ahead of the trouble to start looking at things line-by-line.  The until and finish commands don't take a repeat count (until optionally takes a location to run to instead).

 

This is pretty handy, and you'll use it a lot.  It won't always be the first piece of data that chokes your program.  Sometimes, it'll be exactly the 513th iteration of a particular for loop that crashes, but only in the fourth call to the function.  You can figure out how many times a particular breakpoint is crossed before you get to that point and give the appropriate continue command to get there.  Then you can keep rerunning the program from the beginning with a new run command until you figure out what's happening.

 

And lastly, you only need to type enough of a command to distinguish it from the others (sort of).  So next, step, continue, and until can all be abbreviated as n, s, c, and u respectively.
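If you want something small to try these commands on before turning them loose on your project, a sketch like the following will do.  The function names square and sumOfSquares are made up for this example; it assumes you call sumOfSquares() from your own main() and build with -g and without optimization.

```cpp
#include <cstddef>

// Hypothetical practice code; the names are made up for this sketch.

// A helper to "step" into and "finish" out of.
int square(int v) {
    return v * v;
}

// A loop to practise "next", "until" and "continue N" on.
int sumOfSquares(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        int sq = square(values[i]);  // break on this line, then try "step" or "continue 3"
        total += sq;                 // try "print sq" and "print total" here
    }
    return total;
}
```

With a breakpoint set on the line that calls square(), step takes you into square(), finish brings you back out, next walks over the call instead, and continue with a count skips ahead through the loop.  You can also break on a function by name, e.g. break square.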

 

 

 

 

3.  A few cool tricks

 

3.1  Function calls

Ok, so you can use the GDB printf command to print several things out formatted.  But it's a GDB command that just works like the printf() call.  So what if you could actually call printf()?  Well, you can.  You can either use the call command, or enter a function as part of an expression whose value you want to print with either the print or printf GDB commands:

(gdb) call printf("x = %d\ny = %d\nf = %5.2f\n", x, y, z)

(gdb) print myList->isEmpty()

(gdb) print myList->current

 

The first command does the same job as the printf GDB command we used earlier, except that it calls your program's actual printf() library function instead.  (Note the trailing \n: when stdout is a terminal it's normally line-buffered, so without the newline the output may not show up right away.)  The second command actually executes the isEmpty() function in myList.  The third prints out a private data member of myList-- that's right-- private.  When running a C++ program with GDB, you are basically God.  You can do anything, call anything, and see anything you want.  You can even jump over lines of execution in a function (see the jump command).

 

So then, could you, say, add a debugging function to a class, like a linked-list class, that would print out the entire list, head to tail, but would never be called by your program, just so that you could call it from within GDB?

 

The answer is yes.  You could do that for ANY class or complex data structure.  The print and printf GDB commands are only good for primitive types or more complex objects that are small and can be referenced on a line or two of code.  But now that you can call entire functions, you can write these debugging functions into your code and call them from within the debugger with any arguments you want.  This is an immensely powerful concept.  Think about that for a minute.

 

With this, you can easily see any data structure in your program at any time without having to type more than a single command from within GDB.  You can really see the innards of your program step-by-step.
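Here is a minimal sketch of what such a debugging function might look like for a linked list.  The class IntList and its members (pushFront, toString, debugDump, head_) are made up for this example and assume C++11 or later; your own list class will look different.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical singly linked list of ints; every name here is illustrative.
class IntList {
public:
    IntList() : head_(nullptr) {}
    ~IntList();
    IntList(const IntList&) = delete;             // copying disabled to keep the sketch short
    IntList& operator=(const IntList&) = delete;

    void pushFront(int value);
    std::string toString() const;  // e.g. "[3 -> 2 -> 1]"
    void debugDump() const;        // never called by the program; meant for GDB

private:
    struct Node { int value; Node* next; };
    Node* head_;
};

IntList::~IntList() {
    while (head_ != nullptr) {
        Node* next = head_->next;
        delete head_;
        head_ = next;
    }
}

void IntList::pushFront(int value) {
    head_ = new Node{value, head_};
}

// Builds a printable version of the list, head to tail.
std::string IntList::toString() const {
    std::ostringstream out;
    out << "[";
    for (const Node* n = head_; n != nullptr; n = n->next) {
        out << n->value;
        if (n->next != nullptr) out << " -> ";
    }
    out << "]";
    return out.str();
}

// Defined outside the class body on purpose: a function defined inside the
// class is implicitly inline, and the compiler may not emit code for an
// inline function nobody calls, leaving GDB with nothing to call.
void IntList::debugDump() const {
    std::fprintf(stderr, "%s\n", toString().c_str());
}
```

Assuming myList is a pointer to one of these, you'd stop at a breakpoint and type (gdb) call myList->debugDump() to see the whole list.  Printing to stderr is a design choice: it's normally unbuffered, so the output shows up right away.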

 

3.2 Conditional execution, tracepoints, etc

That's about it.  The only other thing you need to know is how to find out more.  The answer is the help command.

 

By itself, you'll get a list of categories containing other commands.  You can type help followed by a category name for a list of commands in that category, or by a specific command for the help page for that command.  Here are some specific help pages you may wanna check out:

 

You can stop at a breakpoint only if a certain condition is met.  See the condition command.

 

You can set some GDB commands to be executed automatically at a breakpoint.  See the commands command.

 

You can set breakpoints that don't actually stop the program, but just do stuff when they're reached.  These are called tracepoints.  See the tracepoints help page.

 

 

Now you know how to get started debugging your program with GDB.

 

I hope you found this useful.  If so, kindly email Prof. McPherson and let him know that he should give me an 'A.'

 

-Cengiz