Consider this example, in which a structure S
is constructed and passed as an argument to a function:
struct S
{
    S() {}
    float vals[64];
};
inline S makeS() { return {}; }
void foo(const S &);
void bar() { foo( makeS() ); }
The assembly the compilers produce for it (https://godbolt.org/z/odoPr9836) is pretty small, as expected.
But if we remove the constructor from S,
thus converting it into an aggregate (https://godbolt.org/z/qEWKbx5Ps), the assembly becomes many times larger.
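For reference, here is a minimal sketch that puts the two variants of S side by side. The names WithCtor, Agg, makeWithCtor, makeAgg and allZero are hypothetical and used only for illustration, and std::is_aggregate_v assumes C++17 or later. One difference that may matter when comparing the two listings: with the user-provided constructor, `return {};` leaves vals uninitialized, whereas for the aggregate the same expression zero-initializes all 64 floats, so the two versions do not request the same work.

```cpp
#include <cassert>
#include <type_traits>

// Variant from the first listing: the user-provided constructor
// makes the type a non-aggregate.
struct WithCtor
{
    WithCtor() {}       // does not initialize vals
    float vals[64];
};

// Variant from the second listing: no constructor, so the type is an aggregate.
struct Agg
{
    float vals[64];
};

// Calls WithCtor(); the contents of vals are indeterminate.
inline WithCtor makeWithCtor() { return {}; }

// Aggregate-initialization from {}: every element of vals is set to zero.
inline Agg makeAgg() { return {}; }

static_assert(!std::is_aggregate_v<WithCtor>, "user-provided constructor: not an aggregate");
static_assert(std::is_aggregate_v<Agg>, "no constructors: aggregate");

// Hypothetical helper: returns true if every element of a.vals is zero.
inline bool allZero(const Agg &a)
{
    for (float v : a.vals)
        if (v != 0.0f)
            return false;
    return true;
}
```

Calling allZero(makeAgg()) returns true, while reading vals from the result of makeWithCtor() would read indeterminate values.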
It even seems that MSVC does not perform mandatory RVO and copies the aggregate:
...
movups XMMWORD PTR [rcx-128], xmm0
movups xmm0, XMMWORD PTR [rax-96]
movups XMMWORD PTR [rcx-112], xmm1
movups xmm1, XMMWORD PTR [rax-80]
...
Does it follow from this that, in practice, using aggregates produces much less optimal code than using normal C++ classes or structures?