Does the CRTP "pitfall workaround" negate the early binding benefits?

Question

To store pointers to CRTP objects in a homogenous container, the templated base class can itself derive from a class common_base that declares a pure virtual interface and, if required, a virtual destructor. This is sometimes referred to as the CRTP pitfall workaround.

#include <cstdio>  // printf

class common_base
{
public:
  virtual ~common_base() {}
};

template<typename T> class base : public common_base {

public:
  void func() {
    printf("base func()\n");
    static_cast<T*>(this)->func();
  }

};

class derived : public base<derived> {

public:
    void func() {
        printf("derived func()\n");
    }
};

int main() {
  derived d;
  base<derived>& b = d;

  d.func();   // Output: derived func()

  b.func();   // Output: base func()
              //         derived func()
}

All good. Now let's say I want to use the workaround to not only store derived objects in a container, but to also call a common interface on those objects. So I change common_base to be:

class common_base
{
public:
  virtual ~common_base() {}
  virtual void func() = 0;
};

// base<T> and derived are unchanged from the first listing
int main() {
  derived d;
  base<derived>& b = d;
  common_base& c = d;

  d.func();   // Output: derived func()

  b.func();   // Output: derived func()

  c.func();   // Output: derived func()
}

The pure virtual function pattern above is present in the references I've seen, but it looks like it causes a vtable to be used to resolve the func() calls at runtime. That negates the performance benefit of CRTP's compile-time polymorphism. Is it not possible to declare a common base interface with CRTP early binding in this way?

References:

Modern C++ Programming Cookbook - Second Edition

Wikipedia

The bottom-line question you should ask is whether `c.func()` has enough information to resolve the dispatch at compile time. The answer is no, because `common_base` knows nothing about the derived `T`. There is not enough type information to resolve the call to `T::func` because `common_base` erased it. — Quimby, Aug 30 '23 at 06:31
Whether `d.func();` uses run-time vtable dispatch depends on the implementation and optimization mode. — Swift - Friday Pie, Aug 30 '23 at 06:32
The *"CRTP pitfall workaround"* explained in the first paragraph is not a thing. Adding an abstract base interface won't make it possible to store CRTP objects in a homogenous container. *"but to also call a common interface on those objects"* - CRTP performs compile-time calls of "virtual" methods from inside the CRTP class and related templates, so in order to use it you'll need to provide an instance of the derived class. If you use an abstract base class with actual virtual methods, then you essentially hit the runtime dispatch switch. — user7860670, Aug 30 '23 at 07:19
@user7860670 You probably meant "contiguous container". Homogenous containers include one that stores just pointers to the same type. Technically the former is possible, but that requires storing some meta-type information within the base class (its size, type, etc.) — Swift - Friday Pie, Aug 30 '23 at 07:31
can you provide a reference for "CRTP pitfall workaround" ? I tried to research it, but the only thing I found was this question — 463035818_is_not_an_ai, Aug 30 '23 at 07:44
@Swift-FridayPie Storing *pointers* to CRTP objects in a homogenous container is not the same as storing CRTP objects in a homogenous container though. — user7860670, Aug 30 '23 at 07:59
@user7860670 homogenous implies "same type". In C++ *nothing* allows storing different types in a homogenous container without type erasure. — Swift - Friday Pie, Aug 30 '23 at 08:11
@463035818_is_not_an_ai edited to add the references where this is described. — sleep, Aug 31 '23 at 00:10
@user7860670 you're right, the workaround is just storing pointers to the abstract base class. — sleep, Aug 31 '23 at 00:13

n. m. could be an AI · Accepted Answer · 2023-08-31T12:25:49.077

CRTP (or anything else for that matter) is not a way to have the benefits of late binding without actually having late binding. If late binding were replaceable by a better mechanism, it would have been replaced. It has not been, and it is not.

CRTP is a way of having a common code base (living in a template) without having a common base class (because templates are not classes). The absence of a common base class is the whole point. Since there is no common base class, the dynamic type is either the same as the static type or is embedded in the static type as a template parameter, so it is always statically known, and no dynamic dispatch is needed. You cannot reintroduce a common base class into the picture and still use static dispatch everywhere. That would require a miracle, and miracles are forbidden in C++.
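
As a rough illustration of plain CRTP with no common base class, here is a minimal sketch. The names shape, circle, square, name_impl and describe_twice are hypothetical and appear in neither the question nor the book; they only serve to show that shape<circle> and shape<square> are unrelated types, so any generic code that uses them has to be a template itself.

```cpp
#include <string>
#include <type_traits>

// Hypothetical CRTP base: shared code lives in the template.
template <typename T>
class shape {
public:
    // The call to name_impl() is resolved at compile time via T.
    std::string describe() const {
        return "shape: " + static_cast<const T*>(this)->name_impl();
    }
};

class circle : public shape<circle> {
public:
    std::string name_impl() const { return "circle"; }
};

class square : public shape<square> {
public:
    std::string name_impl() const { return "square"; }
};

// No virtual functions anywhere, so neither class is polymorphic.
static_assert(!std::is_polymorphic<circle>::value, "circle is not polymorphic");
// The two instantiations of the template are distinct, unrelated types.
static_assert(!std::is_same<shape<circle>, shape<square>>::value,
              "no common base class");

// Generic code must be a template; T is deduced from the argument's static type.
template <typename T>
std::string describe_twice(const shape<T>& s) {
    return s.describe() + ", " + s.describe();
}
```

Calling describe_twice on a circle and on a square instantiates two separate functions, which is why a single container cannot hold both without some form of type erasure.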

So does this "CRTP pitfall workaround" actually achieve anything? Let's have a closer look. The book referenced in the question shows an example:

class controlbase
{
public:
  virtual void draw() = 0;
  virtual ~controlbase() {}
};
template <class T>
class control : public controlbase
{
public:
  virtual void draw() override
  {
    static_cast<T*>(this)->erase_background();
    static_cast<T*>(this)->paint();
  }
};

Derived classes like class button : public control<button> implement erase_background and paint.

In this setting, draw is called with the dynamic dispatch mechanism, while erase_background and paint are dispatched statically. (There is no contradiction with the above: only draw belongs to the common base class, and calls to draw need to be dynamically dispatched; other calls are not made to methods of the common base class, and those can be in principle statically dispatched). So there's a net win: all calls but one are effectively devirtualised. On the other hand, a competent optimiser should in principle be able to always devirtualise those other calls without the programmer having to resort to CRTP, although I don't know of a compiler that is able to do it.
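
To make the split between the two kinds of dispatch concrete, the sketch below completes the book's listing under some assumptions. The classes button and checkbox, their logging constructors, and the function draw_all_demo are hypothetical additions for illustration; the derived classes record their calls in a log instead of drawing anything.

```cpp
#include <memory>
#include <string>
#include <vector>

class controlbase {
public:
    virtual void draw() = 0;
    virtual ~controlbase() {}
};

template <class T>
class control : public controlbase {
public:
    void draw() override {
        // Both calls are bound at compile time through T.
        static_cast<T*>(this)->erase_background();
        static_cast<T*>(this)->paint();
    }
};

// Hypothetical derived classes that log calls instead of drawing.
class button : public control<button> {
public:
    explicit button(std::vector<std::string>& log) : log_(log) {}
    void erase_background() { log_.push_back("button::erase_background"); }
    void paint() { log_.push_back("button::paint"); }
private:
    std::vector<std::string>& log_;
};

class checkbox : public control<checkbox> {
public:
    explicit checkbox(std::vector<std::string>& log) : log_(log) {}
    void erase_background() { log_.push_back("checkbox::erase_background"); }
    void paint() { log_.push_back("checkbox::paint"); }
private:
    std::vector<std::string>& log_;
};

// Stores pointers to the common base and draws every control.
std::vector<std::string> draw_all_demo() {
    std::vector<std::string> log;
    std::vector<std::unique_ptr<controlbase>> controls;
    controls.push_back(std::unique_ptr<controlbase>(new button(log)));
    controls.push_back(std::unique_ptr<controlbase>(new checkbox(log)));
    for (auto& c : controls) {
        c->draw();  // the only call that goes through the common base class
    }
    return log;
}
```

Only the c->draw() call in the loop is made through controlbase; erase_background and paint are ordinary non-virtual member functions of button and checkbox.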

So in terms of indirection the workaround is only beneficial if the number of static dispatches done by the virtual (`draw()`) is *more than two*, since the virtual itself incurs 2 indirections (vtable lookup + pointer). Otherwise we may as well just use virtual dispatch. In the example there are 4 indirections total (2 for virtual call + 2 calls via pointer), which is the same as if `erase_background()` and `paint()` were both dispatched virtually (2 + 2). Although the cache performance of each approach may be different due to memory layout. — sleep, Sep 01 '23 at 03:43
@sleep Not sure how your arithmetic works, you either have (in a straightforward implementation) 2 virtual calls + 1 non-virtual (assuming `draw` is non-virtual), or (in a CRTP implementation) 1 virtual + 2 non-virtual. — n. m. could be an AI, Sep 01 '23 at 04:11
So in your calculations a non-virtual call to `erase_background` and a non-virtual call to `paint` cost 1 each, while a non-virtual call to `draw` costs 0. How are the two mechanisms different? Have you looked at the actual assembly? (If `draw` can be inlined, so can be the other non-virtual calls). — n. m. could be an AI, Sep 02 '23 at 10:02
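
For comparison, the "straightforward implementation" mentioned in the comments above (two virtual calls plus one non-virtual draw) might look like the following sketch. The names plain_control, plain_button and draw_plain_demo are hypothetical, and whether a given compiler devirtualises any of these calls is not something this sketch demonstrates.

```cpp
#include <string>
#include <vector>

// Hypothetical non-CRTP variant: draw() is non-virtual, the two steps are virtual.
class plain_control {
public:
    virtual ~plain_control() {}
    void draw() {            // one non-virtual call
        erase_background();  // virtual call
        paint();             // virtual call
    }
private:
    virtual void erase_background() = 0;
    virtual void paint() = 0;
};

// Logs its calls instead of drawing.
class plain_button : public plain_control {
public:
    explicit plain_button(std::vector<std::string>& log) : log_(log) {}
private:
    void erase_background() override { log_.push_back("erase_background"); }
    void paint() override { log_.push_back("paint"); }
    std::vector<std::string>& log_;
};

std::vector<std::string> draw_plain_demo() {
    std::vector<std::string> log;
    plain_button b(log);
    plain_control& c = b;  // used through the base, as a container element would be
    c.draw();
    return log;
}
```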
