I've got an auxiliary function that does some operations that are pretty costly.

I'm trying to profile the main section of the algorithm, but this auxiliary function gets called many times within it. Consequently, the measured time includes the auxiliary function's time.

To solve this, I decided to save and restore the system time so that the auxiliary function appears to be instantaneous. I defined the following macros:

#define TIME_SAVE struct timeval _time_tv; gettimeofday(&_time_tv,NULL);   /* remember the current time on entry */
#define TIME_RESTORE settimeofday(&_time_tv,NULL);                         /* set the clock back to it on exit */

. . . and used them as the first and last lines of the auxiliary function. For some reason, though, the auxiliary function's overhead is still included!

So, I know this is kind of a messy solution, and so I have since moved on, but I'm still curious as to why this idea didn't work. Can someone please explain why?

geometrian

2 Answers

If you insist on profiling this way, do not set the system clock. This will break all sorts of things, if you have permission to do it. Basically you should forget you ever heard of settimeofday. What you want to do is call gettimeofday both before and after the function you want to exclude from measurement, and compute the difference. You can then exclude the time spent in this function from the overall time.
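
Something like the following sketch illustrates that idea; expensive_helper() and the surrounding names are placeholders, not code from the question:

#include <stdio.h>
#include <sys/time.h>

/* elapsed time between two gettimeofday() readings, in seconds */
static double seconds_between(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_usec - a.tv_usec) / 1e6;
}

static double excluded_sec = 0.0;   /* total time spent inside the helper */

static void expensive_helper(void)
{
    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    /* ... the costly work ... */
    gettimeofday(&t1, NULL);
    excluded_sec += seconds_between(t0, t1);   /* accumulate; never touch the system clock */
}

int main(void)
{
    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    expensive_helper();              /* stand-in for the main algorithm, which calls the helper many times */
    gettimeofday(&t1, NULL);
    printf("total %.6f s, excluding helper %.6f s\n",
           seconds_between(t0, t1),
           seconds_between(t0, t1) - excluded_sec);
    return 0;
}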

With that said, this whole method of "profiling" is highly flawed, because gettimeofday (1) probably takes a significant amount of time compared to what you're trying to measure, and (2) probably involves a transition into kernelspace, which will do serious damage to your program's caches. The second problem, whereby attempting to observe your program's performance characteristics actually changes them, is the most serious one.

What you really should do is forget about this kind of profiling (gettimeofday or even gcc's -pg/gmon profiling) and instead use oprofile or perf or something similar. These modern profiling techniques work based on statistically sampling the instruction pointer and stack information periodically; your program's own code is not modified at all, so it behaves as closely as possible to how it would behave with no profiler running.
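
For example, on Linux with perf installed, running perf record ./yourprogram followed by perf report is usually enough to get a sampled profile without touching the code at all (exact options vary by version; ./yourprogram is just a placeholder for your binary).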

R.. GitHub STOP HELPING ICE
  • These days, `gettimeofday()` can often be handled without a transition into kernelspace (as a "vsyscall"). – caf Apr 29 '12 at 11:20
  • @caf: The only system I know that can do that is Linux/x86_64. As far as I know, all other OS's, and all other cpu archs under Linux, require a syscall. – R.. GitHub STOP HELPING ICE Apr 29 '12 at 15:36

There are a couple of possibilities for what may be happening. One is that Linux tries to keep the clock accurate, so adjustments to the clock may be 'smoothed' or otherwise 'fixed up' to keep time advancing consistently within the system. If you are running NTP, it will also try to maintain a reasonable sense of time.

My approach would have been not to modify the clock but instead to track the time consumed by each portion of the process. The time spent in the expensive calls would be accumulated (by taking the difference between gettimeofday on entry and on exit, and summing those differences), and that total would then be subtracted from the overall time. There are other possibilities for fancier approaches, I'm sure.
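
For instance, the question's macros could be reworked along these lines, so that they accumulate into a global instead of touching the clock (a sketch; aux_total_sec is a hypothetical name, not from the original code, and the macros are still used as the first and last lines of the auxiliary function):

#include <sys/time.h>

double aux_total_sec = 0.0;   /* total seconds spent in the auxiliary function */

/* record the entry time in a local variable */
#define TIME_SAVE struct timeval _time_tv0; gettimeofday(&_time_tv0, NULL);

/* on exit, add the elapsed time to the global accumulator */
#define TIME_RESTORE { \
        struct timeval _time_tv1; \
        gettimeofday(&_time_tv1, NULL); \
        aux_total_sec += (_time_tv1.tv_sec  - _time_tv0.tv_sec) \
                       + (_time_tv1.tv_usec - _time_tv0.tv_usec) / 1e6; \
    }

At the end of the run, aux_total_sec is subtracted from the overall measured time.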

Art Swri