So - I've been writing a language interpreter as a side project for a year now. Today I have finally decided to test its performance for the first time! Maybe I should have done that sooner... turns out running a Fibonacci function in the language takes x600 the time of the equivalent Python program. Whoopsy daisy.
Anyway... I'm off to profiling. In the call graph, gprof
regards a few functions (namely critical ones) as called from <spontaneous>
. It's a problem because understanding what calls these functions the most frequently will aid me.
I compile the project as a whole like so:
gcc *.c -o app.exe -g -pg -O2 -Wall -Wno-unused -LC:/msys64_new/mingw64/lib -lShlwapi
I use gprof
like so:
gprof app.exe > gprofoutput.txt
Since it's a language interpreter, many of these functions (all of them?) might be called as part of a mutual recursion chain. Is it possible that this is the problem? If so, is gprof
to be trusted at all with this program?
The functions called by <spontaneous>
are compiled as part of the *.c
files of the project, and are not called by an external library or anything that I know of.
Because I have checked this, the other answers here on SO about <spontaneous>
haven't solved my issue. What can be causing these functions to appear as called from <spontaneous>
and how can I fix this?
Example gprof
output (_mcount_private
and __fentry__
are of course irrelevant - including them here in case it grants any clues):
index % time self children called name
<spontaneous>
[1] 46.9 1.38 0.00 _mcount_private [1]
-----------------------------------------------
<spontaneous>
[2] 23.1 0.68 0.00 __fentry__ [2]
-----------------------------------------------
<spontaneous>
[3] 18.7 0.06 0.49 object_string_new [3]
0.17 0.24 5687901/5687901 cell_table_set_value [4]
0.00 0.08 5687901/7583875 make_native_function_with_params [7]
0.00 0.00 13271769/30578281 parser_parse [80]
-----------------------------------------------
0.17 0.24 5687901/5687901 object_string_new [3]
[4] 14.1 0.17 0.24 5687901 cell_table_set_value [4]
0.12 0.05 5687901/5930697 table_set_value_directly [6]
0.02 0.04 5687901/7341054 table_get_value_directly [9]
0.01 0.00 5687901/5930694 object_cell_new [31]
-----------------------------------------------
<spontaneous>
[5] 7.0 0.07 0.14 vm_interpret_frame [5]
0.01 0.05 1410341/1410345 cell_table_get_value_cstring_key [13]
0.01 0.02 242786/242794 cell_table_set_value_cstring_key [19]
0.02 0.00 3259885/3502670 object_thread_pop_eval_stack [22]
0.01 0.00 242785/242786 value_array_free [28]
0.00 0.01 242785/242785 vm_call_object [34]
0.00 0.00 681987/1849546 value_compare [32]
0.00 0.00 485570/31306651 table_init [20]
0.00 0.00 242785/242788 cell_table_free [38]
0.00 0.00 242785/25375951 cell_table_init [29]
0.00 0.00 1/1 object_load_attribute [50]
0.00 0.00 1/1 object_load_attribute_cstring_key [52]
0.00 0.00 1/2 object_user_function_new [56]
0.00 0.00 2/33884613 copy_cstring [17]
0.00 0.00 1/5687909 object_function_set_name [25]
0.00 0.00 1/17063722 copy_null_terminated_cstring [23]
0.00 0.00 1/72532402 allocate [21]
0.00 0.00 3502671/3502671 object_thread_push_eval_stack [81]
0.00 0.00 1167557/1167557 object_as_string [85]
0.00 0.00 681988/681995 two_bytes_to_short [86]
0.00 0.00 485572/485578 value_array_make [88]
0.00 0.00 242786/242786 object_thread_push_frame [96]
0.00 0.00 242786/242786 object_thread_peek_frame [95]
0.00 0.00 242785/242785 object_thread_pop_frame [97]
0.00 0.00 242785/485571 vm_import_module [89]
0.00 0.00 2/1167575 object_value_is [83]
-----------------------------------------------
..... etc .........
I'm running Mingw-w64 GCC on Windows 7.