What is the performance penalty for using aggregates in C++?

Question

Consider this example with a structure S constructed and passed as an argument to a function:

struct S
{
    S() {}
    float vals[64];
};

inline S makeS() { return {}; }

void foo(const S &);

void bar() { foo( makeS() ); }

The assembly the compilers produce (https://godbolt.org/z/odoPr9836) is pretty small, as expected.

But if we remove the constructor from S, thus converting it into an aggregate (https://godbolt.org/z/qEWKbx5Ps), the assembly becomes many times bulkier.

It even seems that MSVC does not perform mandatory RVO and copies the aggregate:

...
movups  XMMWORD PTR [rcx-128], xmm0
movups  xmm0, XMMWORD PTR [rax-96]
movups  XMMWORD PTR [rcx-112], xmm1
movups  xmm1, XMMWORD PTR [rax-80]
...

Does it follow from this that, in practice, using aggregates produces much less optimal code than normal C++ classes or structures?

Related: https://stackoverflow.com/questions/47853659/can-copy-elision-be-perfomed-in-aggregate-initialization-in-c17 Aggregate initialization basically performs element-wise copy-initialization. — 273K, Jul 24 '21 at 20:04
*"Does it follow..."* - No, it doesn't follow. You see a difference **only** because your comparison is unfair. **Aggregate-initialization** has to, by definition, have every member initialized. And those you don't provide for are recursively initialized from `{}`. That means you *asked* to zero out the array. To make a fair comparison try `inline S makeS() { S s; return s; }` as well, so default initialization is performed on your aggregate (and its sub-objects just as your c'tor does). C++ is overly expert friendly. An unpleasant truth, but there you have it. — StoryTeller - Unslander Monica, Jul 24 '21 at 20:32
The same as what? I see the same assembly with or without constructor when default initialization is forced. You still don't seem to be comparing apples to apples. — StoryTeller - Unslander Monica, Jul 24 '21 at 20:44
@StoryTeller i don't think that is doing the same. It's undefined behavior because you copy not yet initialized values, right? — Johannes Schaub - litb, Jul 25 '21 at 10:02
Which is what the OP did with their constructor. Of course, is it formally an lvalue-to-rvalue conversion when the compiler does behind the scenes magic? I'm not sure. — StoryTeller - Unslander Monica, Jul 25 '21 at 10:06

Ranoiaetep · Answer 1 · 2021-07-24T20:50:19.497

Your comparison is unfair.

struct S
{
    S() {}
    float vals[64];
};

With this, you didn't actually initialize vals. To make it a fair comparison, it should've been:

struct S
{
    S() : vals{} {}
    float vals[64];
};

Or simply use:

S() = default;

Now GCC and Clang both produce identical code for the two versions, while MSVC produces the same code for makeS: https://godbolt.org/z/rezxTorGs
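To make the "fair comparison" concrete, here is a minimal sketch that puts the aggregate and the class with a zeroing constructor side by side. The names Agg, Ctor, makeAgg, makeCtor, allZero and demo are made up for illustration and do not appear in the linked Compiler Explorer snippets. It only checks that both versions do the same observable work (a zeroed array); it says nothing about the exact assembly a given compiler emits.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Aggregate: no user-declared constructors.
struct Agg {
    float vals[64];
};

// Class whose constructor explicitly value-initializes the array,
// which is the same work that `return {};` requests for the aggregate.
struct Ctor {
    Ctor() : vals{} {}
    float vals[64];
};

inline Agg makeAgg() { return {}; }    // empty braces: vals is zeroed
inline Ctor makeCtor() { return {}; }  // calls Ctor(), which zeroes vals via vals{}

// Hypothetical helper: true if every element of s.vals compares equal to 0.0f.
template <class T>
bool allZero(const T& s) {
    return std::all_of(std::begin(s.vals), std::end(s.vals),
                       [](float v) { return v == 0.0f; });
}

// Usage sketch: both factories hand back a fully zeroed array.
inline void demo() {
    assert(allZero(makeAgg()));
    assert(allZero(makeCtor()));
}
```

With the original S() {} constructor, by contrast, vals is left uninitialized, so there is nothing to store and the generated code is naturally shorter.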

Also, someone please correct me if I'm wrong:

struct S
{
    float vals[64];
};

inline S makeS() { return {}; }

I believe this one uses the default constructor instead of aggregate initialization.

You might need to use return {{}}; or return {.vals = {}}; to force aggregate initialization.

It's aggregate initialization still. No constructor call. — StoryTeller - Unslander Monica, Jul 24 '21 at 20:56
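The point of that comment can be checked with a small sketch. It assumes a C++17 compiler for the std::is_aggregate check, which is skipped otherwise. The function names makeEmptyBraces and makeNestedBraces and the helpers allZero and sameVals are hypothetical, chosen only for this example. The designated-initializer form {.vals = {}} needs C++20 and is left out.

```cpp
#include <algorithm>
#include <iterator>
#include <type_traits>

struct S {
    float vals[64];
};

#if __cplusplus >= 201703L
// std::is_aggregate exists from C++17 onward.
static_assert(std::is_aggregate<S>::value,
              "S declares no constructors, so it is an aggregate");
#endif

// Empty braces: the array member is initialized from {}, i.e. zeroed.
inline S makeEmptyBraces() { return {}; }

// Nested braces: the inner {} initializes vals explicitly; same result.
inline S makeNestedBraces() { return {{}}; }

// Hypothetical helper: true if every element compares equal to 0.0f.
inline bool allZero(const S& s) {
    return std::all_of(std::begin(s.vals), std::end(s.vals),
                       [](float v) { return v == 0.0f; });
}

// Hypothetical helper: element-wise comparison of the two arrays.
inline bool sameVals(const S& a, const S& b) {
    return std::equal(std::begin(a.vals), std::end(a.vals), std::begin(b.vals));
}
```

Both spellings yield the same zeroed object, so the extra braces are not needed to get a zero-filled array from return {};.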
