I am trying to profile a network with torch.autograd.profiler, and I need some explanation of the CPU and GPU times it reports. I assumed the two would be nearly equal, because the CPU time includes the kernel launch plus execution. However, I cannot find a consistent relationship between the CPU and GPU times.
As shown below, some ops report approximately the same time, some report a CPU time larger than the GPU time, and some report a CPU time smaller than the GPU time. Could someone please explain the difference?
| Op name | CPU time | GPU time |
|---|---|---|
| relu | 14.700us | 15.936us |
| sub | 112.447us | 93.504us |
| mm | 43.501us | 46.912us |
| CatBackward | 84.912us | 84.704us |
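
To make the gaps explicit, here is a small sketch that computes the CPU minus GPU difference and the CPU/GPU ratio for each op. It only does arithmetic on the numbers listed above and does not call the profiler; the names `timings_us`, `compare`, and `gaps` are made up for illustration.

```python
# Timings copied from the list above, in microseconds: (CPU time, GPU time).
timings_us = {
    "relu": (14.700, 15.936),
    "sub": (112.447, 93.504),
    "mm": (43.501, 46.912),
    "CatBackward": (84.912, 84.704),
}


def compare(cpu_us, gpu_us):
    """Return (CPU minus GPU in microseconds, CPU/GPU ratio)."""
    return cpu_us - gpu_us, cpu_us / gpu_us


# Hypothetical helper mapping: op name -> (difference, ratio).
gaps = {op: compare(cpu, gpu) for op, (cpu, gpu) in timings_us.items()}

for op, (diff, ratio) in gaps.items():
    # A positive difference means the reported CPU time exceeds the GPU time.
    print(f"{op:<12} CPU-GPU = {diff:+8.3f} us   CPU/GPU = {ratio:.3f}")
```

The sign of the difference is not the same across ops: it is negative for relu and mm, positive for sub, and close to zero for CatBackward, which is the inconsistency I am asking about.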
Thanks