
I'm working on an nvcc-compiled static library that gets linked into a g++ project. How do I use cuda-gdb on the final executable? All I get is "Program exited normally", without any printf output or anything else.

nvcc is definitely being given the -g -G arguments when compiling the static library.
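For reference, the build setup described above would look roughly like this (file and library names here are hypothetical; the point is that `-g -G` goes on the nvcc compile step for the objects that end up in the static library, and `-g` on the g++ link):

```
nvcc -g -G -c kernels.cu -o kernels.o
ar rcs libkernels.a kernels.o
g++ -g main.cpp -L. -lkernels -L/usr/local/cuda/lib64 -lcudart -o cudasplat
```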

Here is my command line buffer:

cuda-gdb /home/sean/cuda-workspace/cudasplat/Debug/cudasplat 
NVIDIA (R) CUDA Debugger
5.0 release
Portions Copyright (C) 2007-2012 NVIDIA Corporation
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/sean/cuda-workspace/cudasplat/Debug/cudasplat...done.
(cuda-gdb) set args -t a1-31 a2-31 a3-31 a4-31 -L 30 -o tx_coverage -d /var/www/userman/plot-temp/ -db -85 -ngs -dbm -R 20
(cuda-gdb) run
Starting program: /home/sean/cuda-workspace/cudasplat/Debug/cudasplat -t a1-31 a2-31 a3-31 a4-31 -L 30 -o tx_coverage -d /var/www/userman/plot-temp/ -db -85 -ngs -dbm -R 20
[Thread debugging using libthread_db enabled]
Exiting...

Program exited normally.
(cuda-gdb)

This is what normally happens without debugging:

/home/sean/cuda-workspace/cudasplat/Debug/cudasplat -t  a1-31 a2-31 a3-31 a4-31 -L 30 -o tx_coverage -d /var/www/userman/plot-temp/ -db -85 -ngs -dbm -R 20
            --==[ Welcome To CUDASPLAT! HD v1.4.0a ]==--

Loading "51:52:113:114-hd.sdf" into page 1... Done!
Loading "50:51:113:114-hd.sdf" into page 2... Done!
Loading "50:51:114:115-hd.sdf" into page 3... Done!
Loading "51:52:114:115-hd.sdf" into page 4... Done!
copying 444 mb into device memory (3878 mb free)
finished copy
min_north 50, max_north 52, min_west 113, max_west 115
allocated antenna memory
invalid argument in ../cudapath.cu at line 551
Sean
  • Your program had a normal exit. You issued the run command in the debugger, so it ran until it hit a breakpoint (you didn't set any) or else a program termination. I guess your question is around the program output? Are you passing the same args when you run at the command line vs. the set args command in cuda-gdb? (are any of those args redirecting output?) The other possibility is that your program is experiencing a termination before it outputs anything, in the debug case. Maybe set a breakpoint at main and then step through to the first formatted output and see what happens. – Robert Crovella Nov 28 '12 at 18:08
  • It appears that it's not executing the program with the arguments I give it. When I set breakpoints on functions that should be called based on the arguments provided none of those breakpoints are hit. – Sean Nov 28 '12 at 20:12
  • not sure what is happening there. When I launch cuda-gdb with a simple program that is expecting 2 arguments, and I do `set args 10 10`, I get expected results, likewise if I don't set arguments I get an expected error message printed out. Since you're not even getting your Welcome To CUDASPLAT message (presumably that is early in your code?) perhaps you can step through to there rather than setting breakpoints. You may get a better clue as to what is happening. – Robert Crovella Nov 28 '12 at 20:37
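Robert's suggestion of breaking at main and checking what the program actually received can be sketched as a cuda-gdb session (commands only; output will vary):

```
(cuda-gdb) break main
(cuda-gdb) run
(cuda-gdb) print argc
(cuda-gdb) print argv[1]
(cuda-gdb) next
```

If `print argc` shows 1, the arguments from `set args` never reached the program; if it shows the expected count, the arguments arrived and the early exit is happening elsewhere.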

1 Answer

  1. You should set a breakpoint before issuing a run command.
  2. Does your application perform proper error checking? Note that cuda-gdb may "hide" GPUs used to render your OS's graphical interface. E.g. if you have a single-GPU system and run a CUDA application from cuda-gdb inside a windowing environment (such as GNOME or KDE), your application may fail because no GPUs will be detected.
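The error-checking point can be sketched with the standard CUDA error-checking macro pattern (a common idiom, not necessarily what cudasplat uses):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Wrap every CUDA runtime call so failures (e.g. "no CUDA-capable
// device is detected") are reported immediately, instead of the
// program silently exiting before any output.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    int count = 0;
    // If cuda-gdb has hidden the display GPU, this is where it shows up.
    CUDA_CHECK(cudaGetDeviceCount(&count));
    printf("visible CUDA devices: %d\n", count);
    return 0;
}
```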
Eugene
  • Could the application fail during regular runtime due to running in a window manager? – Sean Nov 28 '12 at 19:58
  • Probably not. If a tool like deviceQuery can "see" the GPU, then any other program should be able to as well for ordinary (non-gdb) runs. I run cuda-gdb on a RHEL 6.2 laptop running GNOME, and it seems to work fine from a terminal session within the GUI. In this single-GPU scenario, however, be careful about setting breakpoints in device code; they will probably hang the GUI. For these situations I use a different methodology, where I use a non-NVIDIA GPU for display (i.e. X) and keep the NVIDIA CUDA GPU excluded from X. – Robert Crovella Nov 28 '12 at 20:18
  • I did some digging and you're right, that's the problem I'm having. Looks like I'm losing part of a day to get a second video card on the go : \ – Sean Nov 28 '12 at 20:35
  • another approach is to set the machine to runlevel 3 and ssh into it, assuming the app you are trying to debug does not use X. – Robert Crovella Nov 28 '12 at 20:48
  • @Sean Note that there's also the issue of the watchdog timer: if your kernels take a long time to complete, they may be terminated prematurely if run on a device used to drive the display. I don't know the exact timeout length; it's several seconds. – Eugene Nov 28 '12 at 21:03
  • [This answer](http://stackoverflow.com/questions/13525530/the-launch-timed-out-and-was-terminated/13525785#13525785) may help with understanding the watchdog timer options, especially the nvidia custhelp article linked in the comments. – Robert Crovella Nov 28 '12 at 21:07