2

i'm analysing my code with gprof, and i get the following output:

   %   cumulative   self              self     total           
  time  seconds   seconds    calls  ms/call  ms/call  name    
 59.22      0.58     0.58       48    12.09    18.81  _GLOBAL__sub_I_inputHeight

inputHeight is a global variable. I have no idea what GLOBAL_sub_I_inputHeight means, but it is called 48 times, and consumes 60% (0,6) of the exection time.

Any ideas?

Jon V
  • 506
  • 1
  • 3
  • 21
  • 1
    What is the type of inputHeight? – Matt Phillips Dec 15 '11 at 18:38
  • 8
    GLOBAL_sub_I_inputHeight is a function, not a variable. Variables don't use any execution time, they use memory. – Steve Wellens Dec 15 '11 at 18:43
  • 1
    While that's probably true, @Steve, do you have any explanation for how the variable got transformed into a function? Jonathan, maybe some details about your build environment would be helpful, such as which compiler you're using and what your target platform is. – Rob Kennedy Dec 15 '11 at 18:51
  • If it's C++.. do you have a getter for inputHeight ? – KoKuToru Dec 15 '11 at 18:53
  • @SteveWellens, just to be pedantic, as you are most likely correct about the function, reading/writing variables themselves technically has non-uniform access time and thread locals and volatiles could actually cause the compiler to generate support code beyond just the read/write. – edA-qa mort-ora-y Dec 15 '11 at 18:59
  • 1
    You could fire up objdump and disassemble the function and have a look at it. Without a minimal example of the code leading to this, we cant tell much. – PlasmaHH Dec 15 '11 at 18:59
  • @RobKennedy i'm compiling on linux (Ubuntu) with g++ – Jon V Dec 15 '11 at 19:01
  • @KoKuToru no, but i have a variable inputWidth which is used the same number of times, and this does not show up in gprof – Jon V Dec 15 '11 at 19:02
  • 2
    `_GLOBAL__sub_I_inputHeigth` is the function generated by the compiler and is called before main to initialize the global variable. Why it takes so much time we can only guess. Maybe you should show some code. – rve Dec 15 '11 at 19:04
  • How is `inputHeight` defined? And where (scope, etc.)? This looks like it might be an initialization routine, but I can't see why an initialization routine would be called 60 times, nor how it would take 60% of the time. – James Kanze Dec 15 '11 at 19:06
  • the function call has disappeared after compiling directly in the commandline instead of with a makefile. now i cannot reproduce the problem (even with make) – Jon V Dec 15 '11 at 19:17
  • Have you tried running a test that lasts longer than a second? Mostly it sounds like sampling error. – stonemetal Dec 15 '11 at 19:27
  • The problem seems to be situated with OpenMP. inputHeight is called in a parallel for. – Jon V Dec 15 '11 at 19:37
  • @JonathanVervaeke: and yet your question didn't even **mention** OpenMP. Don't you think that might be relevant information? – jalf Dec 15 '11 at 21:26
  • @jail i had disabled the openmp part, so i thought it was not relevant. But i made a mistake by filling the profile file with >> instead of > (which is stupid, i know :)) – Jon V Dec 15 '11 at 22:28

2 Answers2

1

Lets have a look at some minimal example that behaves similar:

  1 #include <iostream>
  2
  3 int global_int = 42;
  4
  5 int main(int argc, const char *argv[])
  6 {
  7     std::cout << "global_int = " << global_int << "\n";
  8 }



(gdb) bt
#0  global constructors keyed to global_int() () at global.cxx:8
#1  0x0804875d in __do_global_ctors_aux ()
#2  0x080484ec in _init ()
#3  0x080486f9 in __libc_csu_init (argc=1, argv=0xbffff0c4, envp=0xbffff0cc) at elf-init.c:79
#4  0xb7d2069c in __libc_start_main () from /lib/libc.so.6
#5  0x08048591 in _start () at ../sysdeps/i386/elf/start.S:119

This backtrace should give us at least something to google. These are constructors for global values.

191 080486b1 <_GLOBAL__I_global_int>:
192  80486b1:       55                      push   %ebp
193  80486b2:       89 e5                   mov    %esp,%ebp
194  80486b4:       83 ec 18                sub    $0x18,%esp
195  80486b7:       c7 44 24 04 ff ff 00    movl   $0xffff,0x4(%esp)
196  80486be:       00
197  80486bf:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)
198  80486c6:       e8 a6 ff ff ff          call   8048671 <_Z41__static_initialization_and_destruction_0ii>
199  80486cb:       c9                      leave
200  80486cc:       c3                      ret

The disassembly doesn't make us that much smarter here, it calls the function __static_initialization_and_destruction_0(int,int) which is then a bit more intresting:

171 08048671 <_Z41__static_initialization_and_destruction_0ii>:
172  8048671:       55                      push   %ebp
173  8048672:       89 e5                   mov    %esp,%ebp
174  8048674:       83 ec 18                sub    $0x18,%esp
175  8048677:       83 7d 08 01             cmpl   $0x1,0x8(%ebp)
176  804867b:       75 32                   jne    80486af <_Z41__static_initialization_and_destruction_0ii+0x3e>
177  804867d:       81 7d 0c ff ff 00 00    cmpl   $0xffff,0xc(%ebp)
178  8048684:       75 29                   jne    80486af <_Z41__static_initialization_and_destruction_0ii+0x3e>
179  8048686:       c7 04 24 d4 a0 04 08    movl   $0x804a0d4,(%esp)
180  804868d:       e8 9e fe ff ff          call   8048530 <_ZNSt8ios_base4InitC1Ev@plt>
181  8048692:       b8 50 85 04 08          mov    $0x8048550,%eax
182  8048697:       c7 44 24 08 20 a0 04    movl   $0x804a020,0x8(%esp)
183  804869e:       08
184  804869f:       c7 44 24 04 d4 a0 04    movl   $0x804a0d4,0x4(%esp)
185  80486a6:       08
186  80486a7:       89 04 24                mov    %eax,(%esp)
187  80486aa:       e8 61 fe ff ff          call   8048510 <__cxa_atexit@plt>
188  80486af:       c9                      leave
189  80486b0:       c3                      ret

The important part here is that there is a function registered at atexit() which resides at 0x8048550 and is named std::ios_base::Init::~Init() but only when the parameters to this functions are 0x1 and 0xffff. And this function is then also calling the ctor std::ios_base::Init::Init() which is initializing the global stuff needed to use std::cout.

Now we know what it is doing, but not why. Here someone with initimate gcc knowledge would know more, but my guess is that this is the way gcc makes sure that whenever some code that is called to initialize some global/static variables (which runs before main), also the ctor for the i/o subsystem is called, so that you can always use std::cout in there.

As to why this is called so often in your code, and why it takes so long, I can't tell much, maybe you can find out more installing some breakpoint in a debugger. Maybe it is also some wrong accounting to that function because of optimization.

PlasmaHH
  • 15,673
  • 5
  • 44
  • 57
1

when you run run a gprofiled program, it writes the data in a file, usually gmon.out. That file contains the addresses of the functions, not the names. When you read the data with the gprof utility, it resolves the names using the debugging information present in the program.

That means that if you run the program, recompile it and execute gprof without re-running the program, the output is meaningless.

rodrigo
  • 94,151
  • 12
  • 143
  • 190