
I am trying to perform an empirical analysis of the time complexity of a data set of about 1000 codes. I have annotated them manually (how the algorithm scales with respect to the size of its input), and now I am trying to regress the timing data against my complexity equation Y = C + log X + X + X log X + X^2 + X^3 + X^4 + e^X.
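
The regression itself can be set up as an ordinary least-squares fit over the basis functions of that equation. Here is a minimal sketch in Java (all class and method names are mine; for clarity it uses a reduced basis of 1, X, X log X, X^2, and the remaining columns X^3, X^4, e^X are added the same way, one basis function per column):

```java
import java.util.function.DoubleUnaryOperator;

public class ComplexityFit {
    // Reduced basis: Y = C + b1*X + b2*X*log(X) + b3*X^2
    // (append x -> x*x*x, x -> Math.pow(x, 4), Math::exp for the full model)
    static final DoubleUnaryOperator[] BASIS = {
        x -> 1.0,
        x -> x,
        x -> x * Math.log(x),
        x -> x * x
    };

    // Ordinary least squares via the normal equations (A^T A) b = A^T y
    static double[] fit(double[] xs, double[] ys) {
        int k = BASIS.length;
        double[][] ata = new double[k][k];
        double[] aty = new double[k];
        for (int i = 0; i < xs.length; i++) {
            double[] row = new double[k];
            for (int j = 0; j < k; j++) row[j] = BASIS[j].applyAsDouble(xs[i]);
            for (int p = 0; p < k; p++) {
                aty[p] += row[p] * ys[i];
                for (int q = 0; q < k; q++) ata[p][q] += row[p] * row[q];
            }
        }
        return solve(ata, aty);
    }

    // Gaussian elimination with partial pivoting
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            double[] tr = a[col]; a[col] = a[piv]; a[piv] = tr;
            double tb = b[col]; b[col] = b[piv]; b[piv] = tb;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c < n; c++) a[r][c] -= f * a[col][c];
                b[r] -= f * b[col];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r][c] * x[c];
            x[r] = s / a[r][r];
        }
        return x;
    }

    public static void main(String[] args) {
        // Synthetic counts from a quadratic code: y = 5 + 2*x^2
        double[] xs = new double[10], ys = new double[10];
        for (int i = 0; i < 10; i++) { xs[i] = i + 1; ys[i] = 5 + 2 * xs[i] * xs[i]; }
        double[] b = fit(xs, ys);
        System.out.printf("C=%.3f X=%.3f XlogX=%.3f X^2=%.3f%n", b[0], b[1], b[2], b[3]);
    }
}
```

With the full basis, be careful about conditioning: X^4 and e^X dominate the design matrix for even moderate X, so the coefficient on the largest surviving term is the one to trust for classification.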

As a metric for the execution time of a program, I am currently using the number of bytecodes executed during the run. This count can be obtained with the `-XX:+CountBytecodes` flag in the HotSpot JVM. I also tried running with and without the `-Xint` flag, which forces HotSpot to execute all bytecode in interpreted mode.

However, the problem I am facing is that the execution counts change when I run the same program twice, and because of this change in the timing data, the regression results change. Further, when I run two different instances of the same code, the variation in the verdict is huge. I am not using timing data from System.currentTimeMillis/nanoTime for the same reason. (I ran the data 6 times sequentially for about 500 codes and 3 times in parallel in batches of 160 codes. The variation was huge, and even the correlation between these runs was essentially random, ranging from 1.00 down to 0.37 and even small negatives in some cases.)

So I am looking for alternate approaches that can be used as a metric for the running time of a program. The only constraint is that this metric should have a direct relationship with the performance of the code, and should give the same count when the code is executed multiple times (or at least the correlation should be close to 1).

I do not want to run each program multiple times and take the mean/median count, because that slows things down to a great extent.

And is there any other profiler that could give me the count of bytecodes executed during the run of a program?
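
Since you control and have already hand-annotated the source of all ~1000 codes, one fully deterministic alternative metric is to instrument the programs themselves: increment a counter once per basic operation (e.g. per inner-loop iteration) and regress on that count instead of executed bytecodes. The count depends only on the input, so identical code on identical input always yields the identical number. A sketch of the idea (`OpCounter` and the instrumented bubble sort are illustrative names of mine, not an existing profiler API):

```java
public class OpCounter {
    private long ops = 0;
    public void tick() { ops++; }       // call once per counted basic operation
    public long count() { return ops; }

    // Example: an instrumented O(n^2) puzzle solution (bubble sort),
    // counting one comparison as one basic operation.
    static long countedBubbleSort(int[] a) {
        OpCounter c = new OpCounter();
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j + 1 < a.length - i; j++) {
                c.tick();
                if (a[j] > a[j + 1]) { int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t; }
            }
        return c.count();
    }

    public static void main(String[] args) {
        // Deterministic: the same code on the same input always returns the same count.
        System.out.println(countedBubbleSort(new int[]{5, 3, 1, 4, 2})); // prints 10
    }
}
```

The drawback is the one-time instrumentation effort (or a bytecode-weaving pass to automate it), but unlike wall-clock time or JIT-affected bytecode counts, the metric has zero run-to-run variance by construction.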

A related question by me.

I just want the execution counts to be the same if two codes are identical, not otherwise. My sample consists of trivial codes that solve small programming puzzles, where heavy hand optimizations are not possible. It is possible to gain some improvement over another implementation, but mostly the big-O remains the same (and even if it varies, that is fine, as long as the execution counts show it). What I don't want is that in the first run I get some execution counts which get categorized as linear, and running the same identical code again gives different execution counts which get categorized as a different big-O.

  • I would be very surprised if `number of bytecode instructions` correlated with `execution time`. As others stated in your related question, your best shot would be to use a benchmarking tool like JMH – Absurd-Mind Aug 11 '14 at 12:47
  • The problem is, the running time, or count of byte codes executed, can differ by one or two orders of magnitude for the same code, depending on how carefully it has been hand-optimized. (Compiler optimization is fine, but it cannot compare to aggressively hand-optimized code. [*Here's an example.*](http://scicomp.stackexchange.com/a/1870/1262)) I don't know how you can compare time of one code against another in the presence of such heavy constant factors. – Mike Dunlavey Aug 11 '14 at 12:49
  • @MikeDunlavey I just want the execution counts to be the same if two codes are identical, not otherwise. And my sample has trivial codes solving small programming puzzles where heavy hand optimizations are not possible. It is possible to gain some improvement over another implementation, but mostly the big-O remains the same. What I don't want is that in the first run I get some execution counts which get categorized as linear, and running again gives different execution counts which get categorized as a different big-O. – aichemzee Aug 12 '14 at 05:59
  • Just guessing, but are you trying to classify the big-O notation of a submitted programming puzzle? And then reject all answers which are, for example, slower than `O(n^2)`? If yes, then take a look at [domjudge](http://www.domjudge.org/); they execute each submission and test whether it was fast enough – Absurd-Mind Aug 12 '14 at 11:24
  • @Absurd-Mind yes I am trying to classify big-O for programming puzzles. But I am not trying to run an online Judge. I want to increase my accuracy in guessing the complexity. Currently I stand close to 70%, and I want to push as far as I can. But this variation in timing data is messing things up real bad. – aichemzee Aug 12 '14 at 13:06
  • What you could do: take big data samples and count the executed bytecodes; make sure that you disable all optimizations and use `-Xint`. Be aware that this will not result in a 'time complexity' but in an 'algorithm size complexity'. As long as you treat this strictly as 'bytecodes per input', the regression could work. Also keep in mind that similar code can result in different bytecode counts. Next problem: using data structures like `List` can carry a penalty in comparison to arrays, although the difference in time could be negligible. See this: http://stackoverflow.com/a/13812169/562363 – Absurd-Mind Aug 12 '14 at 13:19

0 Answers