c++ performance of returning primitive by value or by (const) reference

Question

Assume the following toy class, and modern compilers (recent gcc for example).

template <typename T>
class SomeVec {
public:
  ...
  virtual T get(const int index) = 0;
};

The application involves a fair amount of number crunching based on values stored in SomeVec subclasses, T being a primitive type, say int. However, the practice of STL containers and boost::numeric::ublas::vector is to return stored values via (const) reference.

I wonder what performance difference this involves. In this question it is shown that array element access by value and vector element access by reference result in the same code, so I assume the compiler can optimize the difference away in at least some cases.

Now my questions are:

(1) STL and ublas are templated, while my solution requires virtual methods. Does this hinder modern compilers' ability to optimize the code?
(2) If the compiler cannot optimize a returned const reference to a primitive into a return by value, am I right to assume that the virtual method call and the dereference cost approximately the same? Or is one significantly more expensive than the other?

Thanks!

You have a pure virtual method here, implying you have derived classes. Can you give an example of one? — Oliver Charlesworth, Nov 11 '10 at 22:44
@Oli: Subclasses can be various, but for example a simple implementation would have an internal vector or map and return one of the stored values. Of course this involves much more overhead than the return type imposes, but let's not consider that. — ron, Nov 11 '10 at 23:02

score 3 · Accepted Answer · answered Nov 12 '10 at 00:25

The reason STL returns references is because templated code doesn't have the luxury of knowing that returned objects are small. While an int is no problem, returning a large struct slows things down for no good reason. In the latter case it only makes sense to use references, and since a reasonable compiler can optimise the case of using primitive types you may as well use references throughout.

Note that your method virtual T get(const int index) differs in other ways to STL container methods. Most importantly, and related to the issue above, your method returns a copy of the indexed object: manipulating the result will not change the state of the object in your container.
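To make the copy-versus-reference distinction concrete, here is a minimal sketch. The helper names (get_by_value, get_by_ref and the two modify_* functions) are hypothetical and not part of the question's class; they only contrast the two return styles on a std::vector<int>.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration: one returns a copy, the other a reference.
int get_by_value(const std::vector<int>& v, std::size_t index) { return v[index]; }
int& get_by_ref(std::vector<int>& v, std::size_t index) { return v[index]; }

// Modifies the returned copy, then reports what the container holds.
int modify_through_copy() {
    std::vector<int> v{1, 2, 3};
    int x = get_by_value(v, 0);
    x = 42;               // changes only the local copy
    (void)x;              // silence "unused" warnings
    return v[0];          // the container is untouched
}

// Modifies the element through the returned reference.
int modify_through_reference() {
    std::vector<int> v{1, 2, 3};
    get_by_ref(v, 0) = 42;  // writes into the container itself
    return v[0];
}
```

With a by-value get, callers need a separate setter (or a reference-returning overload) if they are ever supposed to write to the stored elements.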

Also, declaring the index argument as const does nothing for the caller: you pass index in by value, so all you are doing is preventing yourself from changing index locally within the implementation. If you were passing index in by reference that would be a different matter, but you should be wary of doing so.
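A small sketch of why the const is invisible to callers: top-level const on a by-value parameter is not part of the function's type, so a declaration with it and a definition without it refer to the same function. The function name twice is made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Top-level const on a by-value parameter is dropped from the function type.
static_assert(std::is_same<int(int), int(const int)>::value,
              "both spellings denote the same function type");

int twice(const int index);   // declaration with const

int twice(int index) {        // definition of the very same function, without const
    index *= 2;               // allowed here: index is just a local copy
    return index;
}
```

The const only matters inside whichever definition spells it, where it stops the body from reassigning its own copy.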

Finally, are you really sure that your class needs to be dynamically polymorphic (i.e., have virtual methods)? The STL containers are intentionally designed not to be inherited (which is why they do not have virtual destructors). Containers are not meant to provide an interface for derived classes, rather they are there to facilitate an implementation. I would argue that the subclass examples you suggest could just as easily be implemented as wrapper classes around a templated container, favouring code reuse through composition over inheritance (something advocated by the Gang of Four, amongst others). Aside from being good practice, avoiding virtual methods saves having vtables and corresponding pointers in your objects, and requiring the extra vtable lookup in each call. If you don't really need dynamic polymorphism, why take the cost (and possibly prevent compiler optimisations)?
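As a rough sketch of the composition approach, assuming the storage type can be chosen at compile time (which may not hold for the asker's application): VecStore and sum_first are hypothetical names, not anything from the question or from the STL.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: owns a std::vector instead of deriving from it.
template <typename T>
class VecStore {
public:
    explicit VecStore(std::vector<T> data) : data_(std::move(data)) {}

    // Non-virtual accessor: no vtable pointer in the object, and the call
    // is a straightforward inlining candidate.
    T get(std::size_t index) const { return data_[index]; }

private:
    std::vector<T> data_;
};

// Generic code is written against "anything with get(index)"; the concrete
// store is fixed at compile time rather than looked up at run time.
template <typename Store>
long sum_first(const Store& store, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += store.get(i);
    }
    return total;
}
```

A map-backed store with the same get signature could be passed to sum_first unchanged. The trade-off is that the choice of store can no longer vary at run time behind a single base-class pointer.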

score 2 · Answer 2 · answered Nov 11 '10 at 22:51

It's unusual to have a virtual function in a template class. If the function isn't virtual, the compiler will usually inline the code and optimize the difference between a return by reference and a return by value down to nothing.

The compiler might still inline a virtual function if it is called directly on an object rather than via a pointer or reference - the compiler will know the exact member function to be called in that case, and doesn't need to look it up through a vtable.
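A sketch of the call shapes involved, using made-up types Base and Doubler. Whether a given compiler actually inlines any of these depends on the optimizer and flags; the comments only describe what it is permitted to know. The final specifier is a C++11 addition that goes beyond what the answer describes.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int get(int index) const = 0;
};

// 'final' (C++11) promises that nothing overrides Doubler::get further.
struct Doubler final : Base {
    int get(int index) const override { return 2 * index; }
};

// Called on an object: the dynamic type is known to be Doubler,
// so the call can be bound directly and possibly inlined.
int via_object(int i) {
    Doubler d;
    return d.get(i);
}

// Called through a base reference: in general this needs a vtable lookup,
// unless the optimizer can prove the dynamic type.
int via_base_reference(const Base& b, int i) { return b.get(i); }

// Called through a reference to a final class: no further override can exist,
// so the compiler is allowed to skip the virtual dispatch.
int via_final_reference(const Doubler& d, int i) { return d.get(i); }
```

All three return the same result; only the amount of type information available at the call site differs.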

The expense of having a reference will be small, just a single dereference. It might not even be a whole instruction at the assembly level.

I see, thank you. In my case vtable lookup will be required. As for template + virtual, template is required for different contained types, while abstraction via virtual is because the container element fetching methods can be varied and should be hidden from the user. — ron, Nov 11 '10 at 23:07
