3

I had a little surprise after profiling my program with oprofile.

The profile shows that my program spend 85% in _fini:

CPU: Intel Core/i7, speed 1199 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 samples % image name symbol name 553519 85.7402 eddic _fini

The other symbols looks good.

My program is compiled with GCC 4.7.

From what I understand, the symbol _fini is a deprecated construct for global destruction, so I don't get why my program spend so much time in this symbol.

Can that result from a bad configuration of oprofile or of GCC ?

I tried to profile the unoptimized code and the problem is not present. There are no such symbol with no optimizations.

Is there a way to fix the profile or to avoid spend so much time in _fini ?

I can't paste my paste as it very very long, I didn't isolate the problem.

Thanks for any idea

Yamaneko
  • 3,433
  • 2
  • 38
  • 57
Baptiste Wicht
  • 7,472
  • 7
  • 45
  • 110
  • What disk activity is happening during this time? Maybe it is finishing up writes. Also, if a lot of memory is used it can be releasing pages. – stark Aug 01 '12 at 22:03
  • Some profilers let you ignore functions you are not interested in profiling. Alternatively, you could modify your code to use `_exit()` when you want to profile, but none of your global destructors will run. – jxh Aug 01 '12 at 22:25
  • Yes, I could try to ignore functions, but I would like to know if there is really time spent in _fini, if there is a display error or if it comes from other factors. – Baptiste Wicht Aug 06 '12 at 16:15
  • Can you just paste your code? Maybe there is something strange in class destructors. or maybe it's just a problem with profiler! – PLuS Aug 08 '12 at 22:44
  • I didnt use that profiler so I dont know how it displays the results, but mind that comparing with unoptimized code which had higher timings of other parts of the code may make little sense. The fini timing could be the same as in optimized, but maybe it was so small relatively to the unoptimized rest that it just didnt show up? – quetzalcoatl Aug 08 '12 at 23:43
  • If I exclude symbol _fini in oprofile, the percentages looks much better. In my opinion it is a bug of oprofile or a misuse by my side. – Baptiste Wicht Aug 09 '12 at 20:44

1 Answers1

5

without seeing the code in question is really hard to suggest where the problem lies, but the _fini time suggests destructors for global variables (or static function variables which also exist for the duration of the program). I would suggest - that you inspect the classes of all your global+static variables and see what their destructors are doing - comment out functionality in your program until that stops happening to give you a hint of where it is spending its time - Use gdb or another debugger to inspect what is happening at _fini.

csd
  • 934
  • 5
  • 12
  • I tested my code with gdb and it almost spend no time in _fini. If I break in _fini, it breaks at the very end of my program run, like 10 ms before the end of the program. – Baptiste Wicht Aug 09 '12 at 20:40
  • 2
    could it be then that your program symbols are not complete and oprofile is misreporting code that is past _fini but has no public symbols as being part of _fini ? – csd Aug 10 '12 at 00:28
  • Yes this is possible, but how would I know ? – Baptiste Wicht Aug 10 '12 at 07:47
  • 1
    Hard to say. Ideally you recompile/relink with all symbols and use a non-stripped image. If that is not possible an alternative is to find out where _fini really ends using gdb or objdump and feed that into the profiling – csd Aug 10 '12 at 15:18
  • To find out whether you're "past `_fini`", you can use `objdump -d` to disassemble the program and find `_fini` in the output. Then just read the assembly to determine whether this "falls through" to some other code. Also use `nm` to check what the actual reported codesize for `_fini` is, and whether offsets reported by the profiler are really still within. – FrankH. Aug 13 '12 at 21:18
  • @csd That is exactly what happened in my case. I built without debug info and ran the program and ran `gprof`, and I got 70% CPU in `_fini`. I rebuilt with debug info, reran, and reran `gprof`, and got what I expected: no time in `_fini`. My with and without debug info builds have different optimization options. – doug65536 Jan 28 '14 at 12:10