
In the latest refactoring round of my code, I replaced a bunch of template classes that took a fixed number of template arguments with variadic counterparts. I was quite puzzled to find that one specific performance test case had slowed down by about 20-30%.

A few git bisect roundtrips later, the offending commit was identified. It literally consists of a single change from

template <typename T, typename U>
class foo {};

to

template <typename T, typename ... Args>
class foo {};

I have confirmed experimentally that applying this single change produces the slowdown mentioned above. More puzzling still, switching compiler version (from GCC 4.7 to GCC 4.8) moves the slowdown to another, similar commit (i.e., another switch from fixed to variadic arguments, but in a different class bar).

To give a bit of context, this specific performance test case is a very sparse computer algebra problem which is memory-bound and hence very sensitive to how efficiently the cache is used. This test case has always been a problematic spot in my code (e.g., around GCC 4.4/4.5 I had to manually tweak the compiler options that control the detection of cache line sizes in order to extract maximum performance).

Does anybody have an idea of what could cause this behaviour? Unfortunately, I fear that extracting a reduced test case could be very difficult.

EDIT

For reference, this is the commit that restored good performance. Unfortunately it reverts a bunch of classes to non-variadic code (instead of just one class). I will try to come up with a more confined example.

https://gitorious.org/piranhapp0x/mainline/commit/b952c613b42fe480fe4ed2dfd3e683eb9e38e4cd

bluescarni
  • Is that really an empty class? Or just a simplification? – David Rodríguez - dribeas Sep 24 '13 at 15:34
  • @DavidRodríguez-dribeas: just a simplification. I wanted to make clear that the sole change was the switch to variadic template declaration. The rest of the code remains unchanged and has no knowledge of the difference between variadic and non-variadic versions of the class. – bluescarni Sep 24 '13 at 15:35
  • If the rest of the code remained "unchanged", wouldn't that break every time someone said `U`? – Kerrek SB Sep 24 '13 at 15:36
  • @KerrekSB: good point. In this case the `Args` are simply passed down to a member object which does have all the machinery implemented to handle variadic bits. It's the change in the top class that breaks things. – bluescarni Sep 24 '13 at 15:39
  • @bluescarni: So the change is not *just* that above, but that and the underlying object using the varargs... the question is quite bad as you stated it, since you are asking about the difference on performance of `{}` and `{}` where those represent similar but different code. -1 – David Rodríguez - dribeas Sep 24 '13 at 15:49
  • @DavidRodríguez-dribeas: fair enough. I just do not know how to convey the concept that, to the best of my knowledge and of my experimentations, all seems to point to some effect related to the change in the class declaration. I know SO does not like non-reduced test cases, so, short of linking to the actual commit, I do not know how to pose the question otherwise. I just wanted to know if anyone had some obvious pointers. – bluescarni Sep 24 '13 at 15:53
  • @bluescarni Create a toy version of your problem that exhibits the problem. If you cannot, definitely don't literally lie and claim that something that isn't the "only thing you changed" is the only thing you changed. – Yakk - Adam Nevraumont Sep 25 '13 at 15:07

1 Answer


It's a broad question and the usual suspect (as far as I'm concerned) is recursive handling of variadic template parameters in the generated code.

You need to check if the methods that now use the variadic template parameter are implemented in a way that recursion only happens at compile-time, not at run-time. To give you some idea, you might want to look at some examples, e.g., this answer of mine. Recursion happens at compile-time and the real code is single-step forwarding and expanding.
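To illustrate the distinction, here is a minimal sketch with hypothetical function names (`sum_recursive`, `sum_expanded`); it is not taken from the code in the question. The first version peels off one argument per instantiation, so whether it collapses into straight-line code depends on the inliner. The second expands the pack in a single step, so there is only one function body.

```cpp
#include <initializer_list>

// Recursive style: one instantiation per argument. The recursion is
// resolved at compile time, but the generated code is a chain of calls
// unless the compiler inlines every level.
inline long sum_recursive() { return 0; }

template <typename T, typename... Rest>
long sum_recursive(T first, Rest... rest) {
    return static_cast<long>(first) + sum_recursive(rest...);
}

// Expansion style: the pack is expanded in place into an initializer
// list, so a single function body handles all arguments in a plain loop.
template <typename... Args>
long sum_expanded(Args... args) {
    std::initializer_list<long> values = {static_cast<long>(args)...};
    long total = 0;
    for (long v : values) {
        total += v;
    }
    return total;
}
```

Both functions return the same result for the same arguments; the difference is only in how many function templates get instantiated and how much the optimizer has to see through. Comparing the generated assembly of the two styles in the hot path is one way to check whether this is what is happening.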

Despite what you wrote, I do expect that you actually had to adapt some code, as otherwise Args could never hold anything other than a single parameter and it would be totally pointless to have a variadic template parameter - forgive me if I'm wrong ;) (From your comment, something like the above might be triggered in the code to which you pass the parameter pack.)

Daniel Frey
  • Yeah sorry for the ninja edit. Thanks for the pointer, gonna do some reading. I am pretty sure there is not much recursion going on in my case (I was just trying to avoid placing a default template argument in all the upper hierarchy of my classes - make a default argument in the base class and use variadic in the top), but I am gonna double check again. – bluescarni Sep 24 '13 at 15:42
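For readers following the comments, the pattern described there (a default template argument in the base class, a variadic parameter pack in the top class) might look roughly like the sketch below. The names `base_impl` and `top` are hypothetical and the real classes in the linked commit are certainly more involved.

```cpp
#include <type_traits>

// Hypothetical base: the default argument lives here only.
template <typename T, typename U = int>
struct base_impl {
    using second_type = U;
};

// Hypothetical top-level class: it forwards whatever it receives,
// so it does not need to repeat the default argument.
template <typename T, typename... Args>
class top {
public:
    using impl_type = base_impl<T, Args...>;

private:
    impl_type m_impl;  // member object that receives the pack
};

// With an empty pack, the default from base_impl is picked up.
static_assert(
    std::is_same<top<double>::impl_type, base_impl<double, int>>::value,
    "empty pack falls back to the default argument");
```

One thing this makes visible: `top<double>` and `top<double, int>` are distinct types even though they wrap the same `base_impl<double, int>`, which is a difference from a fixed-arity declaration with a default argument.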