strace, valgrind and gdb as a strace-like tool

Fri 30 July 2010

It has been a long time since I last posted anything technical. Time to share something I learned over the last few months.

My boss at ProFUSION always says there are only two debugging tools worth using: strace and valgrind. Indeed, they are very helpful when developing software in C. If you don’t know them, hurry and look them up! Here I can only summarize what they do:

  • strace traces all the system calls a program makes. You don’t even need the source code. Suppose a program is misbehaving and you want to know which files it opens: all you have to do is intercept its calls into the kernel, i.e. trace all open syscalls, like this:

    bash $ strace -e trace=open ./your-program-to-trace

    This will output all the system calls of type open, with their parameters;

  • valgrind helps you debug problems related to memory accesses. A common mistake is to free an allocated block and then try to access its contents afterwards. Valgrind will tell you exactly where the memory was freed, so you can catch the bug much more easily.
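If you want something small to try the strace command above on, here is a minimal sketch of a target. The function name `count_bytes` is made up for illustration; all that matters is that it calls open(), which is the call a tracer will report.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes in the file at `path`,
 * or -1 if the file cannot be opened or read. */
long count_bytes(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    /* This is the call that shows up in the trace. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Read until end of file (n == 0) or an error (n < 0). */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

One caveat, as far as I know: on recent glibc versions open() is implemented on top of the openat syscall, so `-e trace=open` may print nothing there and `-e trace=open,openat` is the safer filter.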

However, I do think gdb is worth using, and I’ll show here that it is powerful enough to even simulate strace. Let’s say you want to do the same thing as above with strace. You’ll need the ‘catch’ command in gdb:

 (gdb) catch syscall open

Done! Your program will stop at every open syscall. But wait, strace does not stop the program, it just prints the syscall and continues execution. Let me continue… One gdb feature that I see very few people using is commands. Hey! GDB is not only a tool for looking at the backtrace when a program reaches a certain line in your source code! You can automate several things: print memory locations, break when a variable takes a certain value, and so on. So, if you just want to print something, we can improve the previous example:

(gdb) catch syscall open
(gdb) commands 1
> bt
> continue
> end

Now it will print the backtrace every time an open syscall is issued. If you pay attention to the calling convention of your architecture[1] [2], you can inspect the parameters of that syscall and print their values, or stop if one of them is something interesting. For example, on x86_64 the following snippet prints the name of the file being opened:

(gdb) catch syscall open
(gdb) commands 1
> silent
> x/s $rdi
> continue
> end

Yes, you could use strace and grep for the file you are interested in. But where’s the fun in that ;-)? The thing is that both strace and gdb rely on the ptrace syscall, which allows one process to trace another (valgrind works differently: it runs your program on a synthetic CPU instead). So, choose your debugging tool and go fix your code :-)
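To make the ptrace point a bit more concrete, here is a rough sketch of a tracer loop. It is not how strace or gdb are actually written, just a minimal Linux-only illustration under my own assumptions: the function name `trace_syscalls` is invented, and the syscall number is only printed on x86_64 (elsewhere the sketch just counts stops).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] under ptrace and logs every syscall
 * stop to stderr. Returns the number of syscall stops, or -1 on error. */
long trace_syscalls(char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced, then stop so the parent can take over. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    int status;
    long stops = 0;
    int pending_sig = 0; /* signal to hand back to the child on resume */

    /* Wait for the child's initial SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Make syscall stops report SIGTRAP|0x80 so they can be told apart
     * from ordinary SIGTRAPs (such as the one delivered after exec). */
    ptrace(PTRACE_SETOPTIONS, child, NULL, (void *)PTRACE_O_TRACESYSGOOD);

    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)pending_sig) < 0)
            break;
        pending_sig = 0;

        if (waitpid(child, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (!WIFSTOPPED(status))
            continue;

        int sig = WSTOPSIG(status);
        if (sig == (SIGTRAP | 0x80)) {
            stops++;
#if defined(__x86_64__)
            struct user_regs_struct regs;
            if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0)
                fprintf(stderr, "syscall stop: nr=%lld\n",
                        (long long)regs.orig_rax);
#endif
        } else if (sig != SIGTRAP) {
            /* A real signal for the child: deliver it on the next resume. */
            pending_sig = sig;
        }
    }
    return stops;
}
```

Calling it with something like `{ "true", NULL }` as argv should log a handful of stops. Note that each syscall normally produces two stops, one on entry and one on exit, so the count is roughly twice the number of syscalls made.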
