Best method to count byte codes executed for a Java code

Question

I was trying to get timing data for various Java programs. Then I had to perform some regression analysis based on this timing data. Here are the two methods I used to get the timing data:

System.currentTimeMillis(): I used this initially, but I wanted the timing data to be constant when the same program was run multiple times. The variation was huge in this case. When two instances of the same code were executed in parallel, the variation was even larger. So I dropped this and started looking for profilers.
`-XX:+CountBytecodes` flag in the HotSpot JVM: Since the variation in timing data was huge, I thought of measuring the number of bytecodes executed while the code ran. This should have given a more stable count when the same program was executed multiple times, but it also varied. When the programs were executed sequentially, the variations were small, but during parallel runs of the same code, the variations were huge. I also tried running with `-Xint`, but the results were similar.

So I am looking for a profiler that can give me the count of bytecodes executed when a program runs. The count should remain constant (or have a correlation close to 1) across runs of the same program. Alternatively, is there some other metric from which I could derive timing data and which stays almost constant across multiple runs?

I have no idea. Why do you want to do this? VisualVM is a decent profiler. But the data you're trying to collect is meaningless after the JIT optimizes your methods, so I have no idea how you could collect your data. — Elliott Frisch, Aug 04 '14 at 08:18
@ElliottFrisch I want to rate codes based on execution time performance, and compare programs with each other, cluster the data based on programs having similar performance, etc. — aichemzee, Aug 04 '14 at 08:22
Why do you want to exclude the fact that specific timings vary from your measurements? Would this not defeat their purpose? — Drux, Aug 04 '14 at 08:27
@ElliottFrisch I am using the `-Xint` flag, which forces the JVM to execute all bytecode in interpreted mode. So those optimizations should not occur. As a result, this should work, right? — aichemzee, Aug 04 '14 at 08:29
@Drux I am not excluding the fact, I am just making sure that two programs which are exact copies of each other give the same timing data when executed. — aichemzee, Aug 04 '14 at 08:30
@aichemzee If you have a non-trivial code base then caching, network performance, etc., could explain various differences. But then I don't know your exact project of course. — Drux, Aug 04 '14 at 09:19

Peter Lawrey · Answer 1 · 2014-08-04T08:33:20.540

4

> I wanted the timing data to be constant when the same program was run multiple times

That is not possible on a real machine unless it is designed as a hard real-time system, which your machine almost certainly is not.

> I am looking for some profiler that could give me the count of byte codes executed when a code is executed.

Assuming you could do this, it wouldn't prove anything. You wouldn't be able to see, for example, that ++ is 90x cheaper than %, depending on the hardware you run it on. You wouldn't be able to see that a branch miss of an if is up to 100x more expensive than a speculative branch. You wouldn't be able to see that a memory access to an area of memory which triggers a TLB miss can be more expensive than copying 4 KB of data.

> if there could be some other metric based on which I could get timing data, which should stay almost constant across multiple runs.

You can run it many times and take the average. This will hide any high results/outliers and give you a favourable idea of throughput. It can be a reproducible number for a given machine, if run long enough.
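As a rough illustration of "run it many times", here is a minimal sketch that times a task repeatedly and reports both the mean and the median (the median comes up in the comments below). `TimingStats`, its method names, the warm-up count and the placeholder workload are all made up for this example. It uses `System.nanoTime()` instead of `System.currentTimeMillis()` because it is intended for measuring elapsed time. For serious measurements a harness such as JMH is likely the better choice.

```java
import java.util.Arrays;

// Hypothetical helper, not part of any library: times a task repeatedly
// and summarises the samples with the mean and the median.
final class TimingStats {

    // Arithmetic mean of the samples; sensitive to outliers.
    static double mean(long[] samples) {
        double sum = 0;
        for (long s : samples) sum += s;
        return sum / samples.length;
    }

    // Median of the samples; a single slow run barely moves it.
    static double median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Runs the task `warmups` times untimed, then `runs` times timed (in nanoseconds).
    static long[] sample(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) task.run();
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); substitute the program under test.
        Runnable work = () -> {
            long acc = 0;
            for (int i = 0; i < 1_000_000; i++) acc += i % 7;
            if (acc == 42) System.out.println(acc); // use the result so the loop is not discarded
        };
        long[] ns = sample(work, 5, 31);
        System.out.printf("mean=%.0f ns, median=%.0f ns%n", mean(ns), median(ns));
    }
}
```

With samples such as 10, 12, 11, 1000 and 9, the mean is pulled up to 208.4 by the one slow run while the median stays at 11, which is why the median tends to be the more reproducible of the two.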

edited Aug 04 '14 at 08:33

answered Aug 04 '14 at 08:27

Peter Lawrey


I understand this will not be the case, and the variation was huge like I said. So I changed the metric from execution time to number of byte codes executed. They should give a direct relationship to timing data, and should almost be constant when the same program was run multiple times. – aichemzee Aug 04 '14 at 08:34
@aichemzee The concern is that even if you get a reproducible bytecode count, it won't tell you anything about the performance or timing of the application, e.g. how many bytecodes are there in `Thread.sleep(100000);` or `System.gc();` – Peter Lawrey Aug 04 '14 at 08:39

Yes, it would not. But I am rating programs based on how they scale as the input size varies. So such statements would either contribute to the constant term or change the coefficient of the timing function, not its form. (For example, with the timing function t(N)=C_1*N+C_2, C_1 and C_2 would be affected, but the predicted timing function would still show linear scaling.) – aichemzee Aug 04 '14 at 08:44
@aichemzee so you want to measure the program's time complexity empirically? I could write a program which is always O(1) for any realistic input by always taking a long time. ;) Wouldn't it be simpler to just measure how long it runs for based on a number of samples. This is how most benchmarking tools like JMH work. – Peter Lawrey Aug 04 '14 at 08:48
Well, you could write a program that always shows O(1). But the authors of the samples I have assumed the benchmarking would be done manually, so the code is clean. :D I am going to average across multiple runs and look at the results. Thanks, by the way. But yes, running multiple times is going to slow things down, which is bad. – aichemzee Aug 04 '14 at 09:05
@aichemzee running in interpreted mode and adding byte code counts across threads is also going to be pretty slow. ;) – Peter Lawrey Aug 04 '14 at 09:10
yes, a slowdown factor of 10X tops. What if the execution counts are not stable after 10 executions? :) (Let me run the experiments first to reach a better number). – aichemzee Aug 04 '14 at 09:14
@aichemzee Instead of taking an average, you could take the median. This is a more reproducible number as it ignores outliers. I think you will find that counting byte code would be at least 100x slower. – Peter Lawrey Aug 04 '14 at 09:16
I tried taking the median, but collecting timing data over multiple runs is slowing things down. Is there any alternative that could be used as a metric for performance? Or any other profiler that could directly give some value that is proportional to performance? – aichemzee Aug 11 '14 at 10:10
@PeterLawrey "That is not possible on a real machine" — Sure it is. With simple code like `void inc() { counter++; }`, the number of bytecodes executed will always be 7 (aload, dup, getfield, iconst_1, iadd, putfield, return). Counting these can indeed be useful for measuring time complexity. This takes time and is not useful for measuring code in production, but [MMIX](http://mmix.cs.hm.edu/) even simulates the timing of machine instructions. – Roland Illig Jun 07 '17 at 10:12
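
To illustrate the idea discussed in these comments (a deterministic count used to estimate how a program scales), here is a small sketch. It does not use the JVM's bytecode counter: a hand-written step counter stands in for it, and the class `StepCountFit`, its methods and the constant overhead of 3 are assumptions made up for the example. The counts are fitted to t(N)=C_1*N+C_2 by ordinary least squares.

```java
// Hypothetical sketch: count abstract "steps" instead of wall-clock time,
// then fit steps(N) = c1*N + c2 by ordinary least squares.
final class StepCountFit {

    // Linear scan with a manual step counter: one step per element,
    // plus an arbitrary constant of 3 standing in for setup cost.
    static long countedScan(int[] data) {
        long steps = 3;
        for (int ignored : data) {
            steps++;
        }
        return steps;
    }

    // Ordinary least squares for y = c1*x + c2; returns {c1, c2}.
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double c1 = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double c2 = (sy - c1 * sx) / n;
        return new double[] {c1, c2};
    }

    // Measures the step count for each input size and fits the line.
    static double[] fitForSizes(int[] sizes) {
        double[] x = new double[sizes.length];
        double[] y = new double[sizes.length];
        for (int i = 0; i < sizes.length; i++) {
            x[i] = sizes[i];
            y[i] = countedScan(new int[sizes[i]]);
        }
        return fitLine(x, y);
    }

    public static void main(String[] args) {
        double[] c = fitForSizes(new int[] {100, 200, 400, 800});
        System.out.printf("C_1=%.3f, C_2=%.3f%n", c[0], c[1]);
    }
}
```

Because the counter is deterministic, repeated runs give identical coefficients. As noted in the answer, though, such a count says nothing about how expensive each step is on real hardware.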
