Cycles consumed in each function through Oprofile

Question

OProfile is a sampling-based profiler. Running opreport with the -l option gives a profiling report like this:

samples  %        image name               symbol name
78149    15.0776  cvqa                     comp_corr.clone.2

With this information I can see the percentage of time consumed by each function. If I optimize my code and profile again, I get a report like this:

samples  %        image name               symbol name
73179    15.0732  cvqa                     comp_corr.clone.2

This report does not tell me how many cycles the optimization saved, so I cannot use it as a benchmark of the progress made so far.

Is there any way to find out how many cycles were saved, or some other way to benchmark the change?

I am working on an AMD64 machine.

score 0 · Answer 1 · edited Apr 13 '17 at 12:53

Since your real goal is to optimize the program, let me suggest another way to think about it.

The main thing to measure is overall time, not cycles or times of the various routines.
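A minimal sketch of measuring overall wall-clock time, assuming the program can be launched as a subprocess. The helper names `time_command` and `percent_saved` are made up for this illustration, and the command shown is only a placeholder workload; substitute the real binary and its arguments.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def percent_saved(before, after):
    """Percentage of the original time removed by an optimization."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Placeholder workload (hypothetical); replace with the program under test.
    cmd = [sys.executable, "-c", "sum(range(100000))"]
    durations = time_command(cmd, runs=3)
    print(f"median wall time: {statistics.median(durations):.4f} s")
```

Taking the median over several runs reduces the effect of one-off noise. Run it once before and once after a change, then pass the two medians to `percent_saved`.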

Now, here's how to do optimization. Don't base it on any measurements. Rather, get a number of samples of the program's state and (this is the key point) study each sample closely enough, with your own eyes and brain, and understand what the program is doing in that state, and the full reason why it is doing it. (You will see anything worth fixing that statistics could reveal, plus things they could not reveal, and that makes all the difference.)

As soon as you catch it in the act of doing, on two or more samples, something that could be removed, fixing it will give you a substantial speedup. Here is an explanation of why it works and how much speedup you can expect. After you do that, you can do the overall time measurement again and see how much time you saved.
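If you also want a rough before/after figure from the two opreport rows in the question, the absolute sample counts can be compared directly. This is only a sketch, and it assumes both runs used the same event, the same sampling count, and the same workload; otherwise the counts are not comparable. Samples correspond to cycles only when the sampled event is a cycle counter.

```python
def relative_change(before, after):
    """Fractional change from before to after (negative means a reduction)."""
    return (after - before) / before


# Values copied from the two opreport rows for comp_corr.clone.2.
before_samples, before_pct = 78149, 15.0776
after_samples, after_pct = 73179, 15.0732

# Change in samples attributed to the function itself.
function_change = relative_change(before_samples, after_samples)

# Estimated total samples per run, derived from the reported percentages.
total_before = before_samples / (before_pct / 100.0)
total_after = after_samples / (after_pct / 100.0)
total_change = relative_change(total_before, total_after)

print(f"function samples: {function_change * 100:.1f}%")
print(f"estimated total samples: {total_change * 100:.1f}%")
```

The function's sample count drops by roughly 6.4%, while its percentage barely moves. Under the assumptions above, that means the estimated total sample count fell by a similar proportion, which is why the percentage column alone hides the change.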

Then don't stop. Do it again. You'll find something else to fix, which is now a bigger percent because of the first problem you removed.

In my experience, with real software, this can be done as many as 5 or 6 times, after which the program can be orders of magnitude faster than it was originally. The reason is that each optimization removes a fraction of the original execution time, and those fractions can accumulate up to nearly 100%. I'm not aware of any such result achieved with Oprofile or any other profiler tool.
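The arithmetic behind that accumulation can be sketched as follows. The fractions are hypothetical numbers chosen for illustration, not measurements, and the function names are made up for this example.

```python
def overall_speedup(fractions_removed):
    """Speedup after removing the given fractions of the ORIGINAL run time."""
    remaining = 1.0 - sum(fractions_removed)
    if remaining <= 0:
        raise ValueError("removed fractions must sum to less than 1")
    return 1.0 / remaining


def share_of_remaining(fraction, already_removed):
    """How large a fix looks as a share of the time still left."""
    return fraction / (1.0 - sum(already_removed))


# Hypothetical fixes, each expressed as a fraction of the original time.
fixes = [0.5, 0.25, 0.125]
removed = []
for fraction in fixes:
    share = share_of_remaining(fraction, removed)
    removed.append(fraction)
    print(f"fix removes {fraction:.3f} of original time, "
          f"{share:.0%} of what was left, "
          f"cumulative speedup {overall_speedup(removed):.1f}x")
```

With these numbers, each fix accounts for half of the time remaining when it is found, even though the later ones were small fractions of the original run. The cumulative speedup goes 2x, 4x, 8x.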

Thanks for your suggestion. However, I am more interested in knowing the total execution time or the total cycles consumed by the whole program. Is there any way I can get that? If there is a way to do it with gprof, please suggest that too. - Manish Kumar, Feb 13 '13 at 07:29
