Questions tagged [microbenchmark]

A microbenchmark attempts to measure the performance of a "small" bit of code. These tests are typically in the sub-millisecond range. The code being tested usually performs no I/O, or else is a test of some single, specific I/O task.

Microbenchmarking is very different from profiling! When profiling, you work with an entire application, either in production or in an environment very painstakingly contrived to resemble production. Because of this, you get performance data that is, for lack of a better term, real. When you microbenchmark, you get a result that is essentially fictional, and you must be very careful about what conclusions you draw from it.

Still, whichever you are doing, keep the old adage in mind:
Premature optimization is the root of all evil.
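
For concreteness, here is a minimal JMH-style sketch of the kind of sub-millisecond, I/O-free measurement described above; the class, fields, and method under test are illustrative stand-ins, not taken from any question below:

    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    public class StringConcatBench {
        // Non-final fields so the JIT cannot constant-fold the inputs away.
        private String left = "hello";
        private String right = "world";

        // One tiny, I/O-free operation; returning the result keeps it from
        // being eliminated as dead code.
        @Benchmark
        public String concat() {
            return left + right;
        }
    }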

485 questions

14 votes · 1 answer
Direct ByteBuffer relative vs absolute read performance
While I was testing the read performance of a direct java.nio.ByteBuffer, I noticed that the absolute read is on average 2x faster than the relative read. Also, if I compare the source code of the relative vs absolute read, the code is pretty…
asked by Vladimir G.
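
For readers unfamiliar with the terminology, a minimal sketch of the two access styles the question compares, relative get() versus absolute get(index), on a direct buffer (illustrative only, not the asker's benchmark):

    import java.nio.ByteBuffer;

    public class ByteBufferReads {
        public static void main(String[] args) {
            ByteBuffer buf = ByteBuffer.allocateDirect(1024);

            // Relative read: get() reads at the current position and advances it.
            buf.clear();
            long relativeSum = 0;
            while (buf.hasRemaining()) {
                relativeSum += buf.get();
            }

            // Absolute read: get(index) takes an explicit index and leaves the
            // position untouched.
            long absoluteSum = 0;
            for (int i = 0; i < buf.capacity(); i++) {
                absoluteSum += buf.get(i);
            }

            System.out.println(relativeSum + " " + absoluteSum);
        }
    }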

14 votes · 1 answer
Why does python's timeit use the 'best of 3' to measure the time elapsed?
I do not see the rationale for why Python's timeit module measures the time using the best of 3. Here is an example from my console:
~ python -m timeit 'sum(range(10000))'
10000 loops, best of 3: 119 usec per loop
Intuitively, I would have put the…
asked by zell

14 votes · 1 answer
System.arraycopy with constant length
I'm playing around with JMH (http://openjdk.java.net/projects/code-tools/jmh/) and I just stumbled on a strange result. I'm benchmarking ways to make a shallow copy of an array and I can observe the expected results (that looping through the array…
asked by omiel
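
A minimal JMH sketch of the kind of shallow-copy comparison the question describes; the array length of 128 and the method names are assumptions for illustration, not the asker's code:

    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.Throughput)
    @State(Scope.Thread)
    public class ConstantLengthCopy {
        private final int[] src = new int[128];

        // Copy with a compile-time-constant length argument.
        @Benchmark
        public int[] systemArraycopy() {
            int[] dst = new int[128];
            System.arraycopy(src, 0, dst, 0, 128);
            return dst; // returning the copy prevents dead-code elimination
        }

        // Baseline: clone() as another way to make a shallow copy.
        @Benchmark
        public int[] cloneCopy() {
            return src.clone();
        }
    }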

13 votes · 2 answers
Using SIMD on amd64, when is it better to use more instructions vs. loading from memory?
I have some highly performance-sensitive code. A SIMD implementation using SSEn and AVX uses about 30 instructions, while a version that uses a 4096-byte lookup table uses about 8 instructions. In a microbenchmark, the lookup table is faster by 40%. If…
asked by Tumbleweed53

13 votes · 1 answer
Is branch prediction not working?
In reference to this question, the answer specifies that the unsorted array takes more time because it fails branch prediction. But if we make a minor change in the program:
import java.util.Arrays;
import java.util.Random;
public class…
asked by MYK
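
For context, a condensed sketch of the branch-prediction-sensitive loop that the referenced question revolves around; the sizes and the threshold of 128 follow the well-known example and are not the asker's modified program:

    import java.util.Arrays;
    import java.util.Random;

    public class BranchPredictionSketch {
        public static void main(String[] args) {
            int[] data = new int[32 * 1024];
            Random rnd = new Random(42);
            for (int i = 0; i < data.length; i++) {
                data[i] = rnd.nextInt(256);
            }
            Arrays.sort(data); // remove this line and the branch below becomes unpredictable

            long sum = 0;
            for (int pass = 0; pass < 1_000; pass++) {
                for (int value : data) {
                    if (value >= 128) { // well predicted on sorted data
                        sum += value;
                    }
                }
            }
            System.out.println(sum);
        }
    }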

12 votes · 5 answers
How to build and link google benchmark using cmake in windows
I am trying to build google-benchmark and use it with my library using cmake. I have managed to build google-benchmark and run all its tests successfully using cmake. I am unfortunately unable to link it properly with my C++ code on Windows using…
asked by mathguy

11 votes · 11 answers
First time a Java loop is run SLOW, why? [Sun HotSpot 1.5, sparc]
In benchmarking some Java code on a Solaris SPARC box, I noticed that the first time I call the benchmarked function it runs EXTREMELY slowly (10x difference):
First  | 1 | 25295.979 ms
Second | 1 | 2256.990 ms
Third …
asked by Adam Morrison
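
The usual explanation is JIT compilation: the first call runs interpreted while HotSpot compiles the hot code. A hand-rolled warm-up sketch of the standard workaround (illustrative only; a harness such as JMH with @Warmup is the modern way to do this):

    public class WarmupSketch {
        public static void main(String[] args) {
            long sink = 0;

            // Warm-up passes give the JIT a chance to compile work() before timing.
            for (int i = 0; i < 10; i++) {
                sink += work();
            }

            long start = System.nanoTime();
            sink += work();
            long elapsed = System.nanoTime() - start;

            System.out.printf("timed run: %.3f ms (sink=%d)%n", elapsed / 1_000_000.0, sink);
        }

        private static long work() {
            long sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += i;
            }
            return sum;
        }
    }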

11 votes · 1 answer
How to use Java 12's Microbenchmark Suite?
According to JEP 230: Microbenchmark Suite, there exists a microbenchmark suite built into Java 12. The JEP explains that it's basically JMH, but without needing to explicitly depend on it using Maven/Gradle. However, it doesn't specify how to go…
asked by Jacob G.

11 votes · 1 answer
Why is Arrays.copyOf 2 times faster than System.arraycopy for small arrays?
I was recently playing with some benchmarks and found very interesting results that I can't explain right now. Here is the benchmark:
@BenchmarkMode(Mode.Throughput)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 1, batchSize =…
asked by Dmitriy Dumanskiy
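
A stripped-down sketch of the comparison in the question, reusing the annotations visible in the excerpt; the array type and the length of 10 are assumptions for illustration, not the asker's benchmark:

    import java.util.Arrays;
    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.Throughput)
    @Fork(1)
    @Warmup(iterations = 10, time = 1)
    @State(Scope.Thread)
    public class SmallArrayCopy {
        private final Object[] src = new Object[10]; // the "small array" case

        @Benchmark
        public Object[] arraysCopyOf() {
            return Arrays.copyOf(src, src.length);
        }

        @Benchmark
        public Object[] systemArraycopy() {
            Object[] dst = new Object[src.length];
            System.arraycopy(src, 0, dst, 0, src.length);
            return dst;
        }
    }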

11 votes · 1 answer
Control number of operation per iteration JMH
My current setup:
public void launchBenchmark() throws Exception {
    Options opt = new OptionsBuilder()
        .include(this.getClass().getName())
        .mode(Mode.Throughput) // Calculate number of operations in a time unit.
        …
asked by Xitrum
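
One way to tell JMH how many logical operations a single benchmark invocation performs is the @OperationsPerInvocation annotation, so throughput is reported per operation rather than per method call; a minimal sketch (not the asker's setup):

    import org.openjdk.jmh.annotations.*;

    @BenchmarkMode(Mode.Throughput)
    @State(Scope.Thread)
    public class BatchedOps {
        private static final int OPS = 1_000;

        // JMH scales the reported score so it reflects single operations
        // rather than whole 1_000-element batches.
        @Benchmark
        @OperationsPerInvocation(OPS)
        public long processBatch() {
            long sum = 0;
            for (int i = 0; i < OPS; i++) {
                sum += i;
            }
            return sum;
        }
    }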

11 votes · 2 answers
What is faster in Ruby, `arr += [x]` or `arr << x`
Intuitively, the latter should be faster than the former. However, I was very surprised when I saw benchmark results:
require 'benchmark/ips'
b = (0..20).to_a; y = 21;
Benchmark.ips do |x|
  x.report('<<') { a = b.dup; a << y }
  …
asked by DNNX

11 votes · 2 answers
microbenchmark as data frame or matrix
Is there any way to transform the output of the function microbenchmark::microbenchmark into a data frame or matrix? For example:
v <- rnorm(100)
m <- microbenchmark(mean(v), sum(v))
The output:
Unit: nanoseconds
expr min lq mean median uq …
asked by jjankowiak

11 votes · 1 answer
How can I use JMH for Scala benchmarks together with sbt?
I have tried to use JMH together with sbt, but so far I have not managed to set it up properly so that .scala-based benchmarks work. Since the combination of sbt + .java-based benchmarks works, I tried to start from that base. I am using sbt…
asked by Beryllium

10 votes · 1 answer
R equivalent of microbenchmark that includes memory as well as runtime
Background: This is the "microbenchmark" package for R: https://cran.r-project.org/web/packages/microbenchmark/index.html
The first line in the reference manual says that it is built for "Accurate Timing Functions". One problem with this is the…
asked by EngrStudent

10 votes · 6 answers
How to minimize the costs for allocating and initializing an NSDateFormatter?
I noticed that using an NSDateFormatter can be quite costly. I figured out that allocating and initializing the object already consumes a lot of time. Further, it seems that using an NSDateFormatter in multiple threads increases the costs. Can there…
asked by JJD