My CUDA nvprof 'API Trace' and 'GPU Trace' are not synchronized - what to do?

Question

I'm using the CUDA 7.0 profiler, nvprof, to profile some process making CUDA calls:

$ nvprof -o out.nvprof /path/to/my/app

Later, I generate two traces: the 'API trace' (what happens on the host CPU, e.g. CUDA runtime calls and ranges you mark) and the 'GPU trace' (kernel executions, memsets, H2Ds, D2Hs and so on):

$ nvprof -i out.nvprof --print-api-trace --csv 2>&1 | tail -n +2 > api-trace.csv
$ nvprof -i out.nvprof --print-gpu-trace --csv 2>&1 | tail -n +2 > gpu-trace.csv

Every record in each of the traces has a timestamp (or a start and end time). The problem is that time value 0 is not the same in these two traces: the GPU trace's time-0 point seems to be when the first GPU operation triggered by the profiled process begins to execute, while the API trace's time-0 point seems to be the beginning of process execution, or thereabouts.

I've also noticed that when I use nvvp and import out.nvprof, the values are corrected; that is to say, the start time of the first GPU op is not 0, but something more realistic.

How do I obtain the correct offset between the two traces?

If you ask for both at the same time, I don't see any "offset". `nvprof --print-gpu-trace --print-api-trace ./my_app` Are you doing something different? Do you see something different? Is there some reason you can't ask for both at the same time from `nvprof` ? [Here](http://pastebin.com/7z2mFhbB) is an example of what I see. Note that I've marked a few lines for comparison with <<...>> — Robert Crovella, Apr 09 '15 at 21:09
how about `nvprof -i out.nvprof --print-gpu-trace --print-api-trace --csv 2>&1 | tail -n +2 > comb-trace.csv` — Robert Crovella, Apr 09 '15 at 21:34
@RobertCrovella: Hmmph. Whaddaya know. Make that an answer please? — einpoklum, Apr 09 '15 at 21:40

score 3 · Accepted Answer · edited Apr 10 '15 at 06:34

It may not be obvious from the nvprof documentation, but it is possible to specify both --print-gpu-trace and --print-api-trace when requesting output from nvprof, whether you are profiling an app or extracting information from a previously captured profiler output file.

If you are profiling an app, the following should generate a "harmonized" timeline for both API activity and GPU activity:

nvprof --print-gpu-trace --print-api-trace ./my_app

You can save the output using the --log-file option.
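
As a minimal sketch of saving the combined trace to a file: the names ./my_app and trace.log below are placeholders rather than anything taken from the question, and the guard around the call is only there so the snippet does nothing harmful on a machine without nvprof or without the app.

```shell
#!/bin/sh
# Placeholder names (assumptions): substitute your own application and log file.
APP=./my_app
LOG=trace.log

if command -v nvprof >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Both trace flags go into one invocation, so the API and GPU records
    # are printed against a single timeline; --log-file sends nvprof's
    # output to $LOG instead of the terminal.
    nvprof --print-gpu-trace --print-api-trace --log-file "$LOG" "$APP"
else
    echo "nvprof or $APP not available; nothing profiled" >&2
fi
```

The log file written this way is plain text output; it is distinct from the profiler output file produced by nvprof -o.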

Similarly, if you are extracting output from a previously captured output file (not the same thing as a log file), you can do something like the following:

nvprof -i profiler_out_file --print-gpu-trace --print-api-trace ...

where profiler_out_file should be the name of the file you previously saved using the nvprof -o ... option.

Printing both traces with a single command is essential here for the two (combined) timelines to begin at the same point in time; if you issue two commands, each printing a different trace, each trace may use its own time-0 point and the two may not be 'harmonized'.
