
I have a multithreaded application, and when I run VTune Profiler on it, the Caller/Callee tab shows that the callee function's CPU Time: Total - Effective Time is larger than the caller function's CPU Time: Total - Effective Time.

For example, A is the caller function and B is the callee function (nothing calls B except A):

Function    CPU Time: Total - Effective Time
A           54%
B           57%

My understanding is that CPU Time: Total is the function's CPU Time: Self plus the time of all of that function's callees. By that definition, shouldn't CPU Time: Total of A be at least as large as that of B?

What am I missing here?

yashC

1 Answer

  1. Function B might also be called by some function other than A. In that case B's total would include time that is not attributed to A, which could explain the difference.

  2. Intel VTune Profiler works by sampling, so the numbers are less accurate for short runs. If your application runs only for a very short time, consider using the Allow multiple runs option in VTune or increasing the run time.

  3. Intel VTune Profiler also sometimes rounds the numbers, so the result may not be exact. That difference is very small, though, around 0.1%, whereas in your question it is 3%, so rounding is unlikely to be the reason.
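
To illustrate point 1, here is a minimal Python sketch of how a sampling profiler could end up with a larger total for a callee than for one of its callers. It is a simplified model, not VTune's actual algorithm: the sampled stacks, the function names `main`, `C`, and `D`, and the helper `total_time_percent` are all hypothetical, and the sample counts were chosen only to reproduce the 54% and 57% from the question.

```python
from collections import Counter

def total_time_percent(samples):
    """Return {function: percent of samples in which it appears on the stack}."""
    totals = Counter()
    for stack in samples:
        # Count each function once per sample, even if it appears twice (recursion).
        for func in set(stack):
            totals[func] += 1
    return {f: 100.0 * n / len(samples) for f, n in totals.items()}

# Hypothetical sampled call stacks, outermost frame first.
samples = (
    [("main", "A", "B")] * 54      # B reached through A
    + [("main", "C", "B")] * 3     # B reached through a second caller, C
    + [("main", "D")] * 43         # unrelated work
)

totals = total_time_percent(samples)
print(totals["A"], totals["B"])    # 54.0 57.0
```

In this model a function's total is the share of samples in which it is anywhere on the stack, so B can exceed A only when some samples contain B without A. Checking B's callers in the Caller/Callee pane would show whether that is the case in the real profile.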

General Grievance
  • Could the reason be that this is a multithreaded application? (The summary page shows 131 threads in total; 56 of them call function A, which calls function B.) I tried with a single thread and, for this particular case, got the caller and callee times in the expected order. – yashC Dec 11 '21 at 17:10