In the latest refactoring round of my code, I replaced a bunch of template classes that had a fixed number of template arguments with variadic counterparts. I was quite a bit puzzled to find out that a specific performance test case had seen a drop in performance of about 20-30%.
A few rounds of git bisect later, the offending commit was identified. It literally consists of a single change from
template <typename T, typename U>
class foo {};
to
template <typename T, typename ... Args>
class foo {};
I have confirmed experimentally that applying this single change produces the slowdown mentioned above. Even more puzzlingly, switching compiler versions (from GCC 4.7 to GCC 4.8) moves the slowdown to another, similar commit (i.e., another switch from fixed to variadic arguments, but in a different class, bar).
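For what it's worth, a quick layout sanity check along these lines (a minimal sketch; foo_fixed and foo_variadic are just hypothetical stand-ins for the old and new forms of the class) suggests the two-argument instantiations should have identical size and alignment, so a straightforward layout change does not look like an obvious explanation:

// Old fixed-arity form (hypothetical stand-in).
template <typename T, typename U>
class foo_fixed {};

// New variadic form (hypothetical stand-in).
template <typename T, typename ... Args>
class foo_variadic {};

// For a two-argument instantiation, size and alignment should match.
static_assert(sizeof(foo_fixed<int, double>) == sizeof(foo_variadic<int, double>), "size differs");
static_assert(alignof(foo_fixed<int, double>) == alignof(foo_variadic<int, double>), "alignment differs");

int main() {}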
To give a bit of context, this specific performance test case is a very sparse computer algebra problem which is memory-bound and hence very sensitive to efficient cache utilisation. This test case has always been a problematic spot in my code (e.g., around GCC 4.4/4.5 I used to have to manually tweak the compiler options governing the detection of cache line sizes in order to extract maximum performance).
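For instance, the kind of tweak I mean looks roughly like this (the values here are only illustrative; the real ones depend on the target CPU):

g++ -O2 --param l1-cache-line-size=64 --param l1-cache-size=32 --param l2-cache-size=512 ...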
Does anybody have an idea of what could cause this behaviour? Unfortunately, I fear that extracting a reduced test case could be very difficult.
EDIT
For reference, this is the commit that restored good performance behaviour. Unfortunately, it consists of a revert to non-variadic code for a bunch of classes (rather than just one class). I will try to come up with a more confined example.
https://gitorious.org/piranhapp0x/mainline/commit/b952c613b42fe480fe4ed2dfd3e683eb9e38e4cd