As part of a system design, we need to implement the Factory pattern. Alongside it, we are also using CRTP (the curiously recurring template pattern) to provide a base set of functionality that the derived classes can then customize.
Sample code below:
class FactoryInterface {
public:
    // virtual destructor so objects can be deleted through the interface
    virtual ~FactoryInterface() {}
    virtual void doX() = 0;
};

// force all derived classes to implement custom_X_impl
template <typename Derived, typename Base = FactoryInterface>
class CRTP : public Base
{
public:
    // overrides Base::doX; this is the only virtual call on the path
    void doX() {
        // do common processing..... then
        static_cast<Derived*>(this)->custom_X_impl();
    }
};

class Derived : public CRTP<Derived>
{
public:
    void custom_X_impl() {
        // do custom stuff
    }
};
Although this design is convoluted, it does provide a few benefits. All the calls after the initial virtual function call can be inlined. The call to the derived class's custom_X_impl is resolved at compile time, so it is also made efficiently.
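To make the call path concrete, here is a minimal sketch of how an object from the factory would be used, assuming C++11. The names `make_derived`, `demo_dispatch`, and `call_log` are hypothetical and exist only for illustration; the log just records the order in which the two stages run.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Interface handed out by the factory: one virtual call per doX().
class FactoryInterface {
public:
    virtual ~FactoryInterface() {}
    virtual void doX() = 0;
};

// Illustrative only: records which stage ran, in order.
std::vector<std::string> call_log;

template <typename Derived, typename Base = FactoryInterface>
class CRTP : public Base {
public:
    void doX() override {
        call_log.push_back("common");                  // common processing
        static_cast<Derived*>(this)->custom_X_impl();  // static dispatch, inlinable
    }
};

class Derived : public CRTP<Derived> {
public:
    void custom_X_impl() { call_log.push_back("custom"); }
};

// Hypothetical factory function: callers only ever see the interface.
std::unique_ptr<FactoryInterface> make_derived() {
    return std::unique_ptr<FactoryInterface>(new Derived());
}

// Hypothetical driver: one virtual call, then the statically bound custom step.
std::size_t demo_dispatch() {
    call_log.clear();
    std::unique_ptr<FactoryInterface> obj = make_derived();
    assert(obj);
    obj->doX();
    return call_log.size();
}
```

Calling `demo_dispatch()` goes through the vtable once (for `doX`), and everything after that is an ordinary non-virtual call the compiler is free to inline.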
I wrote a program to compare this design against similar implementations using function pointers and plain virtual functions (tight loop, repeated calls). This design came out ahead with GCC 4.8 at -O2 and -O3.
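The comparison was along the lines of the sketch below. This is not the actual program, only a simplified reconstruction assuming C++11; `Counter`, `FpCounter`, `BenchResult`, and the `bench_*` functions are made-up names. Note that the concrete type is visible to the compiler here, so it may devirtualize the call, which a real benchmark would need to guard against.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

struct Iface {
    virtual ~Iface() {}
    virtual void doX() = 0;
};

template <typename D, typename Base = Iface>
struct Crtp : Base {
    void doX() override { static_cast<D*>(this)->custom_X_impl(); }
};

// Virtual + CRTP variant: the custom step just bumps a counter.
struct Counter : Crtp<Counter> {
    std::uint64_t n = 0;
    void custom_X_impl() { ++n; }
};

// Function-pointer variant: the custom step is stored as a pointer.
struct FpCounter {
    std::uint64_t n = 0;
    void (*fn)(FpCounter*) = nullptr;
};

void fp_increment(FpCounter* c) { ++c->n; }

struct BenchResult {
    std::uint64_t count;  // number of calls that actually ran
    double nanos;         // wall-clock time for the loop
};

typedef std::chrono::steady_clock Clock;

double elapsed_ns(Clock::time_point t0, Clock::time_point t1) {
    return std::chrono::duration<double, std::nano>(t1 - t0).count();
}

BenchResult bench_virtual(std::uint64_t iters) {
    Counter c;
    Iface* iface = &c;  // call through the interface
    Clock::time_point t0 = Clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) iface->doX();
    Clock::time_point t1 = Clock::now();
    BenchResult r = {c.n, elapsed_ns(t0, t1)};
    return r;
}

BenchResult bench_function_pointer(std::uint64_t iters) {
    FpCounter c;
    c.fn = &fp_increment;
    assert(c.fn != nullptr);
    Clock::time_point t0 = Clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) c.fn(&c);
    Clock::time_point t1 = Clock::now();
    BenchResult r = {c.n, elapsed_ns(t0, t1)};
    return r;
}
```

Both functions return the call count alongside the elapsed time, so the two loops can be checked to have done the same work before their timings are compared.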
However, a C++ guru told me yesterday that any virtual function call in a large running program can take a variable amount of time because of cache misses, and that I could potentially get better performance using C-style function table look-ups and gcc hotlisting of functions. Even so, I still see 2x the cost for that approach in the sample program mentioned above.
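For reference, this is roughly how I understand the suggestion. It is a minimal sketch under two assumptions: that "hotlisting" means GCC's `hot` function attribute, and that the table is indexed by a small enum. `Context`, `HandlerId`, `kHandlers`, `do_x`, and `demo_table` are names I made up for the example.

```cpp
#include <cassert>

// Assumption: "hotlisting" refers to GCC's `hot` function attribute.
#if defined(__GNUC__)
#define HOT_FN __attribute__((hot))
#else
#define HOT_FN
#endif

struct Context {
    long total;
};

// Index into the table; kHandlerCount is the number of entries.
enum HandlerId { kAddOne = 0, kAddTen = 1, kHandlerCount };

typedef void (*Handler)(Context*);

// "Custom" steps, one per would-be derived class.
HOT_FN void add_one(Context* c) { c->total += 1; }
HOT_FN void add_ten(Context* c) { c->total += 10; }

// C-style function table replacing the vtable.
static const Handler kHandlers[kHandlerCount] = { add_one, add_ten };

// "Base" entry point: common processing, then an indexed indirect call.
HOT_FN void do_x(Context* c, HandlerId id) {
    assert(id < kHandlerCount);
    // do common processing..... then
    kHandlers[id](c);
}

// Hypothetical driver: runs both handlers once and returns the total.
long demo_table() {
    Context c = {0};
    do_x(&c, kAddOne);
    do_x(&c, kAddTen);
    return c.total;
}
```

The indirect call through `kHandlers[id]` is still a load plus an indirect branch, much like a vtable call, so I am not sure where a saving would come from other than code placement.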
My questions are:
1. Is the guru's assertion true? Either way, are there any links I can refer to?
2. Is there a low-latency implementation I can refer to in which a base class invokes a custom function in a derived class using function pointers?
3. Any suggestions on improving the design?
Any other feedback is always welcome.