Google Benchmark has two ways of registering benchmarks and two ways of providing input to them.
One can register benchmarks with BENCHMARK or with benchmark::RegisterBenchmark (more info here).
If I want to give a benchmark function one number at a time, I can use ->Arg() and friends. This is very limited but works with both methods of registering benchmarks. Alternatively, the benchmark function can accept extra arguments, which are provided by benchmark::RegisterBenchmark(name, function, arguments...). This works only with RegisterBenchmark. (Hardcoding things could be a third way of providing arguments, but I don't count it.)
By trial and error (the documentation isn't very comprehensive), I came to these facts:
1. benchmark::RegisterBenchmark() makes a copy of the given arguments
2. all registered benchmarks are accumulated until benchmark::RunSpecifiedBenchmarks() is executed
3. benchmark::RunSpecifiedBenchmarks() can be run only once
This isn't efficient. Let's say I have two benchmarks, A and B. Each of these should be run three times for three big std::vector<double>s.
How I would do it (if I had the choice):
1. load std::vector<double> 1 (the loading can be expensive)
2. run benchmark A with std::vector<double> 1 passed by const reference (no copying)
3. run benchmark B with std::vector<double> 1 passed by const reference (no copying)
4. destruct std::vector<double> 1
5. load std::vector<double> 2 (the loading can be expensive)
6. run benchmark A with std::vector<double> 2 passed by const reference (no copying)
7. run benchmark B with std::vector<double> 2 passed by const reference (no copying)
8. destruct std::vector<double> 2
9. load std::vector<double> 3 (the loading can be expensive)
10. run benchmark A with std::vector<double> 3 passed by const reference (no copying)
11. run benchmark B with std::vector<double> 3 passed by const reference (no copying)
12. destruct std::vector<double> 3
As you can see, only one std::vector<double> exists at a time. Loading each vector can be expensive and shouldn't be repeated if a copy of the loaded data, or a reference to it, is sufficient.
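The steps above can be sketched in plain C++ with a hand-rolled timer. This deliberately does not use Google Benchmark; it only shows the order of operations and the lifetimes I would like to have. load_vector, benchmark_a, benchmark_b, measure and run_all are hypothetical names, and the tiny vectors stand in for my real data.

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an expensive load (e.g. parsing a large file).
std::vector<double> load_vector(std::size_t index) {
    return std::vector<double>(1000 * (index + 1), 1.0);
}

// Hypothetical benchmark bodies; both only read the data.
double benchmark_a(const std::vector<double>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0);
}

double benchmark_b(const std::vector<double>& data) {
    double sum_of_squares = 0.0;
    for (double x : data) sum_of_squares += x * x;
    return sum_of_squares;
}

struct Measurement {
    double seconds;  // wall-clock time of a single call
    double result;   // returned value, kept so the call has an observable effect
};

// Times one call of `body` on `data`, which is passed by const reference.
template <class Body>
Measurement measure(Body body, const std::vector<double>& data) {
    const auto start = std::chrono::steady_clock::now();
    const double result = body(data);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {elapsed.count(), result};
}

// Load, run A, run B, destroy; repeated for each input.
std::vector<Measurement> run_all(std::size_t input_count) {
    std::vector<Measurement> measurements;
    for (std::size_t i = 0; i < input_count; ++i) {
        const std::vector<double> data = load_vector(i);     // steps 1, 5, 9
        measurements.push_back(measure(benchmark_a, data));  // steps 2, 6, 10
        measurements.push_back(measure(benchmark_b, data));  // steps 3, 7, 11
    }  // `data` is destroyed at the end of each iteration: steps 4, 8, 12
    return measurements;
}
```

Calling run_all(3) yields six measurements while never holding more than one input vector in memory. Of course this throws away everything Google Benchmark gives me (iteration count selection, statistics, reporting), which is why I'd rather not do it this way.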
How Google benchmark does it:
1. load std::vector<double> 1 (the loading can be expensive)
2. register benchmark A with std::vector<double> 1 passed by value (copying!)
3. register benchmark B with std::vector<double> 1 passed by value (copying!)
4. destruct std::vector<double> 1 (the two copies made in steps 2 and 3 still exist)
5. load std::vector<double> 2 (the loading can be expensive)
6. register benchmark A with std::vector<double> 2 passed by value (copying!)
7. register benchmark B with std::vector<double> 2 passed by value (copying!)
8. destruct std::vector<double> 2 (the two copies made in steps 6 and 7 still exist)
9. load std::vector<double> 3 (the loading can be expensive)
10. register benchmark A with std::vector<double> 3 passed by value (copying!)
11. register benchmark B with std::vector<double> 3 passed by value (copying!)
12. destruct std::vector<double> 3 (the two copies made in steps 10 and 11 still exist)
13. run all benchmarks with benchmark::RunSpecifiedBenchmarks() (this doesn't even destroy all the copies of the vectors after finishing)
14. run benchmark::Shutdown() (even this doesn't destroy the copies!)
15. reach the end of main() - this is where the destructors are called
Here, 6 copies of a std::vector<double> (two copies of each of the three input vectors) exist in step 13 and later. This is in sharp contrast to the single vector in my way.
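The lifetime problem can be reproduced without the library. The sketch below is a stand-in, not Google Benchmark's actual implementation: I assume only that registration stores a by-value copy of the argument, as described above. Tracked, registry, register_by_value and live_copies_after_registration are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Payload that counts how many instances are currently alive.
struct Tracked {
    static int live;
    std::vector<double> values;
    explicit Tracked(std::size_t size) : values(size, 1.0) { ++live; }
    Tracked(const Tracked& other) : values(other.values) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical benchmark bodies.
double benchmark_a(const Tracked& data) {
    return std::accumulate(data.values.begin(), data.values.end(), 0.0);
}

double benchmark_b(const Tracked& data) {
    return data.values.empty() ? 0.0 : data.values.front();
}

// Stand-in for the library's list of registered benchmarks.
std::vector<std::function<double()>> registry;

// Stores a copy of `data` next to the body, mimicking by-value registration.
void register_by_value(double (*body)(const Tracked&), const Tracked& data) {
    registry.push_back([body, data] { return body(data); });
}

// Registers A and B for three inputs and reports how many payloads are alive.
int live_copies_after_registration() {
    for (std::size_t i = 0; i < 3; ++i) {
        Tracked data(1000);                    // "load" input i
        register_by_value(benchmark_a, data);  // copy 1 of input i
        register_by_value(benchmark_b, data);  // copy 2 of input i
    }  // the original is destroyed here, the two stored copies are not
    return Tracked::live;
}
```

Starting from an empty registry, live_copies_after_registration() returns 6, and the count only drops back to 0 once the registry itself is cleared or destroyed.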
In my (ideal) situation, I pass the data by const reference and therefore copy nothing. But this isn't possible in the real world, for two reasons. The original vectors could be destroyed before the benchmarks execute, which would cause UB. And the argument may be mutable, in which case a benchmark would modify the original vectors, which is undesirable.
A "compromise":
1. load std::vector<double> 1 (the loading can be expensive)
2. run benchmark A with std::vector<double> 1 passed by value (copying!) -> copy is destroyed after benchmark completion
3. run benchmark B with std::vector<double> 1 passed by value (copying!) -> copy is destroyed after benchmark completion
4. destruct std::vector<double> 1 (no vector 1 exists now)
5. load std::vector<double> 2 (the loading can be expensive)
6. run benchmark A with std::vector<double> 2 passed by value (copying!) -> copy is destroyed after benchmark completion
7. run benchmark B with std::vector<double> 2 passed by value (copying!) -> copy is destroyed after benchmark completion
8. destruct std::vector<double> 2 (no vector 2 exists now)
9. load std::vector<double> 3 (the loading can be expensive)
10. run benchmark A with std::vector<double> 3 passed by value (copying!) -> copy is destroyed after benchmark completion
11. run benchmark B with std::vector<double> 3 passed by value (copying!) -> copy is destroyed after benchmark completion
12. destruct std::vector<double> 3 (no vector 3 exists now)
AFAIK this wouldn't cause problems even if the benchmark function wanted to mutate its argument.
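Here is a minimal sketch of the compromise, again in plain C++ rather than with Google Benchmark. It counts live payloads to check that at most two exist at any moment (the original plus the copy held by the running benchmark). Tracked, benchmark_a, benchmark_b and peak_live_in_compromise are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Payload that tracks the current and the maximum number of live instances.
struct Tracked {
    static int live;
    static int peak;
    std::vector<double> values;
    explicit Tracked(std::size_t size) : values(size, 1.0) { note_birth(); }
    Tracked(const Tracked& other) : values(other.values) { note_birth(); }
    ~Tracked() { --live; }

private:
    static void note_birth() {
        if (++live > peak) peak = live;
    }
};
int Tracked::live = 0;
int Tracked::peak = 0;

// Hypothetical benchmark bodies. Each takes its argument by value,
// so it is free to mutate its own copy.
double benchmark_a(Tracked data) {
    data.values[0] = 42.0;
    return data.values[0];
}

double benchmark_b(Tracked data) {
    data.values.clear();
    return static_cast<double>(data.values.size());
}

// Load, run A on a copy, run B on a copy, destroy; repeated for three inputs.
int peak_live_in_compromise() {
    for (std::size_t i = 0; i < 3; ++i) {
        Tracked original(1000);  // "load" input i
        benchmark_a(original);   // copy is destroyed when the call completes
        benchmark_b(original);   // same here
    }  // the original is destroyed here
    return Tracked::peak;
}
```

peak_live_in_compromise() returns 2, compared with the 6 simultaneous copies in the Google Benchmark flow, and the original is never modified because each body works on its own copy.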
This is impossible to do because benchmark::RegisterBenchmark() accumulates all benchmarks until benchmark::RunSpecifiedBenchmarks() is called. This isn't a problem in and of itself, but as I said in fact 3, benchmark::RunSpecifiedBenchmarks() can be run only once (I've tried it, and a second call of this function results in a segfault).
The vectors are large. All six of them don't fit into my RAM and swap at once. How can I solve this?