The standard pitch for expression templates in C++ is that they increase efficiency by removing unnecessary temporary objects. Why can't C++ compilers already remove these unnecessary temporary objects?
I think I already know the answer, but I want to confirm it because I couldn't find a low-level explanation online.
Expression templates essentially allow (or force) an extreme degree of inlining. However, even with inlining, compilers do not seem able to optimize out calls to operator new and operator delete: as far as I understand, they treat those calls as opaque because the functions can be replaced in another translation unit. Expression templates remove those calls entirely for intermediate objects.
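To make the claim concrete, here is a minimal sketch of the technique, not taken from any particular library. The names Expr, Sum and Vec are hypothetical and exist only for illustration. Adding vectors builds a lightweight Sum node that holds references and owns no storage, so the only allocation happens when the whole expression is finally evaluated into a Vec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base (hypothetical): marks a type as a vector expression.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for lhs + rhs. It stores references only and never allocates.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R>> {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {
        assert(l.size() == r.size());
    }
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Concrete vector: the only type in this sketch that owns heap storage.
struct Vec : Expr<Vec> {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    // Evaluate any expression in one loop, with a single allocation.
    template <typename E>
    Vec(const Expr<E>& e) : data(e.self().size()) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            data[i] = e.self()[i];
        }
    }

    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Builds a Sum node instead of computing a temporary Vec.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With Vec a(3, 1.0), b(3, 2.0), c(3, 3.0), the statement Vec r = a + b + c; creates two Sum nodes on the stack and allocates once, for r. The nodes hold references to temporaries, so in this sketch an expression must be evaluated within the same full expression that builds it.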
These superfluous calls to operator new and operator delete can be seen in a simple example where we only copy:
#include <array>
#include <vector>

std::vector<int> foo(std::vector<int> x)
{
    std::vector<int> y{x};
    std::vector<int> z{y};
    return z;
}

std::array<int, 3> bar(std::array<int, 3> x)
{
    std::array<int, 3> y{x};
    std::array<int, 3> z{y};
    return z;
}
In the generated code, foo() compiles to a relatively lengthy function with two calls to operator new and one call to operator delete, while bar() compiles to only a transfer of registers and does no unnecessary copying.
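For anyone who wants to check the allocations without reading assembly, here is a rough sketch that replaces the global allocation functions and counts calls. The counters g_news and g_deletes are hypothetical names introduced for this example. The exact counts are not guaranteed: they depend on the compiler, the standard library, and the optimization level.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical counters, added only for this illustration.
static std::size_t g_news = 0;
static std::size_t g_deletes = 0;

// Replacement global allocation function: counts calls, then defers to malloc.
void* operator new(std::size_t n) {
    ++g_news;
    if (void* p = std::malloc(n ? n : 1)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Replacement deallocation functions (unsized and sized forms).
void operator delete(void* p) noexcept {
    if (p) ++g_deletes;
    std::free(p);
}

void operator delete(void* p, std::size_t) noexcept {
    if (p) ++g_deletes;
    std::free(p);
}

// Same function as in the question.
std::vector<int> foo(std::vector<int> x) {
    std::vector<int> y{x};
    std::vector<int> z{y};
    return z;
}
```

Reading g_news and g_deletes before and after a call such as foo(v) shows how many allocations that call performed in a given build. Passing an lvalue v also copies it into the parameter x, which is an additional allocation made at the call site.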
Is this analysis correct?
Could any C++ compiler legally elide the copies in foo()?