Will the compiler optimize functions which return structures with fixed size arrays?

Question

Assuming I have a struct in C/C++ with fixed size array members, for example:

#define SIZE 10000
struct foo{
  int vector_i[SIZE];
  float vector_f[SIZE];
};

and I would like to create a function that will return an instance of foo, like:

foo func(int value_i, float value_f){
  int i;
  foo f;
  for(i=0;i<SIZE;i++) f.vector_i[i] = value_i;
  for(i=0;i<SIZE;i++) f.vector_f[i] = value_f;
  return f;
}

If I call the function using:

foo ff = func(1,1.1);

will the compiler perform some kind of optimization (i.e. TCO)?

Will the executable fill the ff variable directly, or will it first fill f inside func and then copy all values from f to ff?

How can I check if the optimization is performed?

This is quite a huge object to keep as a local variable. If I calculated right, it should be 625 KB (on a 32-bit platform), while on Windows I think each thread's stack can be up to 1 MB — george_ptr, May 12 '16 at 11:07
Pass a pointer (or reference) to the structure as an argument instead, and you simply don't have to worry about it. — Some programmer dude, May 12 '16 at 11:08
You could look at the generated assembler. In C++, you could also define a copy constructor and see if it gets called. — Karsten Koop, May 12 '16 at 11:09
Identifiers beginning with an underscore and uppercase letter are reserved and defining `_SIZE_` in C will invoke *undefined behavior*. — MikeCAT, May 12 '16 at 11:09
@GeorgeAl this is just an example. Suppose that `SIZE=100` and I use this function `100000` times. My question is whether the compiler avoids copying the data from `f` to `ff` — ztik, May 12 '16 at 11:10
See this http://stackoverflow.com/questions/9653072/return-a-struct-from-a-function-in-c or use *pointers* and return pointer to the struct you manipulated inside the function. — ralf htp, May 12 '16 at 11:11
@JoachimPileborg but this is exactly what he wants to find out, right? He could put a cout in the copy constructor and see what happens. — Karsten Koop, May 12 '16 at 11:12
If you want to look at optimizations done, write a minimal test program, then get assembly output from your compiler, and examine that to see what really happens. There's really no other way to know for sure, optimizations are not standardized (other than optimized code must behave as-if no optimizations were done, except for undefined behavior, which can turn things really nasty with optimizations enabled). — hyde, May 12 '16 at 11:14
See func as a factory function. Create struct on heap and return a shared_ptr, or possibly even a unique_ptr. — Erik Alapää, May 12 '16 at 11:14
There is no language C/C++, only the two **different** languages C and C++. The code shown is not valid in C, but it is in C++, so I removed the C tag as inappropriate. — too honest for this site, May 12 '16 at 11:25
Off topic: TCO (Tail Call Optimisation) is transforming a function call in tail position into a jump. I think you meant RVO. — molbdnilo, May 12 '16 at 11:27
@Olaf: For most practical purposes, ANSI C 89 (the C dialect most C code in the world is written in) is very close to a true subset of C++. — Erik Alapää, May 12 '16 at 11:32
@ErikAlapää: 1) Standard C is ISO 9899, current version 2011. 2) C89 is **not** standard C. 3) The only reason C89 (or C90) is still in use is broken, ancient compilers like MSVC, which have not evolved in about 17 years. 4) Without reliable statistics, I strongly doubt most users still use C90. 5) C++ has also evolved. The "common subset" has shrunk to the point of being unsuitable for implementing any larger project. For the same reason we don't write programs completely in assembler. 6) Identical syntax/grammar does not imply identical semantics. E.g. see `const`. — too honest for this site, May 12 '16 at 11:38
@ErikAlapää: I will not discuss this further. It is commonly accepted that they are different languages for various reasons, one example of which I listed. Just look around. Whoever told you this does not know at least one of the two languages well enough to write useful production code in it. See the accepted answer: even for this question there can be differences. — too honest for this site, May 12 '16 at 11:41
@Olaf I altered the source so that it will be `C` compliant. However my question was: Are `C++` compilers going to perform any optimization? Are `C` compilers going to perform any optimization? — ztik, May 12 '16 at 11:42
@Olaf: I did not claim C89 is standard C in 2016. For most practical purposes, C is a subset of C++. Especially so since gcc allows C99 constructs in C++. One particular feature I like, since C++ is the best language for high-performance programming, is the keyword restrict. — Erik Alapää, May 12 '16 at 11:43
@ztik: The accepted answer is C++-specific and the two are different languages (despite what some minority might claim). So please leave this at C++. Ask another question about C, if you are really interested. — too honest for this site, May 12 '16 at 11:43
@ErikAlapää: Repetition does not make your statement true. And the last sentence is **pure nonsense** and just a personal opinion! Learn true high-level languages like Python, etc. The optimal language strongly depends on the actual problem. C is optimal for certain problems, C++ for some others, Python for a third set. There are fields where more than one language is sufficient; then it is a personal matter. Also other aspects are relevant (availability of a language, libraries, etc). — too honest for this site, May 12 '16 at 11:47
@Olaf, no nonsense here. Are you really claiming that C is far from being a pure subset of C++? Many codebases are actually mixing C and C++, with C used for really low-level stuff where C++ abstractions are not essential. Anyway, no point in taking this debate further. — Erik Alapää, May 12 '16 at 12:08
You can use http://gcc.godbolt.org/ to check the generated assembly online for different compilers. — jotik, May 12 '16 at 12:52

eerorika · Accepted Answer · 2016-05-12T13:16:18.817

My answer applies to C++.

Will the compiler perform some kind of optimization (i.e. TCO)?

By TCO do you mean "tail call optimization"? The function doesn't make a function call at the end (a tail call, if you will), so that optimization doesn't apply.
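For contrast, here is a minimal sketch of what a tail call looks like. The function `sum_to` is a hypothetical helper, not something from the question; whether a given compiler actually turns the call into a jump depends on the compiler and the optimization level.

```cpp
// Hypothetical helper: adds n + (n-1) + ... + 1 to the accumulator acc.
// The recursive call is the very last thing the function does, so it is
// in tail position and a compiler may replace the call with a jump.
unsigned long sum_to(unsigned n, unsigned long acc) {
    if (n == 0) {
        return acc;
    }
    return sum_to(n - 1, acc + n);  // tail call: nothing left to do afterwards
}
```

Calling `sum_to(10, 0)` adds up the integers from 1 to 10. The question's `func` ends with `return f;`, which returns a local object and involves no call at all.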

The compiler can elide the copy from the local variable f to the function's return value through named return value optimization (NRVO). The copy-initialization of ff from that return value can also be elided.

How can I check if the optimization is performed?

By reading the generated assembly code.

If you can't read assembly, another approach is to add copy and move constructors that have side effects and observe whether those side effects occur. However, modifying the program can affect whether the compiler decides to optimize (although side effects in those constructors are not enough to prevent copy elision).

If you don't want to rely on optimization, you should explicitly pass an existing object to the function by reference (by pointer in C), and modify it in place.
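A minimal sketch of the side-effect approach, assuming C++11 or later. The type `Traced`, its counters, and the functions `make_traced` and `count_constructions` are all made up for illustration; the counters play the role of the observable side effects.

```cpp
#include <cstdio>

// Hypothetical instrumented type: the static counters record every
// copy or move construction that actually runs.
struct Traced {
    static int copies;
    static int moves;
    int data[100];

    Traced() : data{} {}
    Traced(const Traced& other) {
        for (int i = 0; i < 100; ++i) data[i] = other.data[i];
        ++copies;
    }
    Traced(Traced&& other) noexcept {
        for (int i = 0; i < 100; ++i) data[i] = other.data[i];
        ++moves;
    }
};

int Traced::copies = 0;
int Traced::moves = 0;

// Same shape as func in the question: fill a named local, return it.
Traced make_traced(int value) {
    Traced t;
    for (int& x : t.data) x = value;
    return t;  // NRVO candidate: a single named local is returned
}

// Resets the counters, builds one object from make_traced, and reports
// how many copy/move constructions were observed.
int count_constructions() {
    Traced::copies = Traced::moves = 0;
    Traced result = make_traced(7);
    std::printf("copies=%d moves=%d (first element: %d)\n",
                Traced::copies, Traced::moves, result.data[0]);
    return Traced::copies + Traced::moves;
}
```

If both counters stay at zero, the object was built directly in `result`. A nonzero move count means the compiler did not apply NRVO for that build; the counts can differ between compilers, flags, and language versions.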
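A sketch of that alternative, reusing the struct from the question. The name `fill_foo` is hypothetical, and `SIZE` is written here as a `constexpr` constant instead of the question's macro.

```cpp
#include <cstddef>

constexpr std::size_t SIZE = 10000;  // same value as the question's #define

struct foo {
    int vector_i[SIZE];
    float vector_f[SIZE];
};

// Hypothetical variant of func: writes into an object the caller already
// owns, so nothing is returned and no copy can take place.
void fill_foo(foo& out, int value_i, float value_f) {
    for (std::size_t i = 0; i < SIZE; ++i) out.vector_i[i] = value_i;
    for (std::size_t i = 0; i < SIZE; ++i) out.vector_f[i] = value_f;
}
```

The caller would write something like `foo ff; fill_foo(ff, 1, 1.1f);`. This also leaves it up to the caller where `ff` lives (stack, static storage, or heap).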

Standard reference for copy elision [class.copy] §31 (current standard draft)

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects. [...]

The section describes the criteria, which are met in this case. The quote was taken from the standard draft generated on 2016-04-07. Numbering may vary across versions of the standard, and the rules have changed slightly. The quoted part has been unchanged since C++03, where the section is [class.copy] §15.

edited May 12 '16 at 13:16

answered May 12 '16 at 11:15


Wouldn't not invoking these side-effects violate code-correctness? Not sure about C++, but in C, there is something called "observable behaviour" which must not change by compiler optimisations. – too honest for this site May 12 '16 at 11:29
@Olaf copy elision has a special exception that lets it ignore the as-if rule. I don't know about C, so I added a disclaimer (although copying cannot have side effects besides the copy itself in C anyway, can it?). – eerorika May 12 '16 at 11:31
@user2079303 Copying in C doesn't have any side effects, so it's a non-issue there. – molbdnilo May 12 '16 at 11:34
Thanks! do you know if compilers consider this optimization as a standard (perform from -O0) or it is included in higher optimization values? – ztik May 12 '16 at 11:36
@ztik I've seen gcc apply copy elision with -O0 – eerorika May 12 '16 at 11:38
Thanks, it's enough! – ztik May 12 '16 at 11:38
@ztik: There is no guarantee this is done by a specific compiler or other. If you are after a guarantee, pass the struct by reference (C++ only). – too honest for this site May 12 '16 at 11:51
@user2079303: Can you please provide a reference to the C++ standard? At first glance, this looks like a good source of trouble to me. – too honest for this site May 12 '16 at 11:53
@Olaf standard reference provided. – eerorika May 12 '16 at 12:01
@user2079303: Thanks. However, as that is a draft, it would be helpful to provide the version number and for which standard version the draft is. – too honest for this site May 12 '16 at 12:09
@Olaf I cannot find the version number of the draft from the site, but there is a generation date. – eerorika May 12 '16 at 12:16
Hmm, from the date I'd say it is for the next version (C++17). I think it is fine then, but maybe it is better to cite the valid standard in the first place - not that I expect this to have changed significantly. Would break too much code, I think. However, this is all optional. It completely depends on how much work the compiler developers put into this. (I know some quite expensive commercial embedded compilers which don't implement even less complex optimisations). – too honest for this site May 12 '16 at 12:30
@Olaf (N)RVO and temporary copy elision have existed since C++03 at least, where the section is `[class.copy] §15`. Conditions have been relaxed a bit in some version, but as far as this particular case is concerned, there has been no change since C++03. I don't have a copy of C++98, but I would expect that it was similar to C++03. And I agree, if one cannot depend on optimization, they should pass a reference instead of returning a value. – eerorika May 12 '16 at 12:54
Actually that's what goes on behind the scenes anyway: The caller allocates a temporary `struct` and passes a reference or pointer to the callee. Btw. the question is incomplete, as it completely ignores how/if the caller uses a temporary object for the call. Quite some places for unnecessary trouble. Worse considering memory consumption. I definitely would play safe here and not pass the object by value. – too honest for this site May 12 '16 at 13:06

Maxim Egorushkin · Answer 2 · 2016-05-12T12:18:01.577

This is pretty well documented in Agner Fog's Calling Conventions document, § 7.1 Passing and returning objects, Table 7. Methods for returning structure, class and union objects.

A struct, class or union object can be returned from a function in registers only if it is sufficiently small and not too complex. If the object is too complex or doesn't fit into the appropriate registers then the caller must supply storage space for the object and pass a pointer to this space as a parameter to the function. The pointer can be passed in a register or on the stack. The same pointer is returned by the function. The detailed rules are given in table 7.

In other words, large return objects get constructed directly in the caller supplied buffer (on the caller's stack).
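One way to observe this from inside the program is to have the callee record the address of its local object and compare it with the address of the caller's variable. This is only a sketch: `Big`, `make_big` and `built_in_place` are made-up names, and the result depends on the compiler, the ABI, and whether NRVO is applied.

```cpp
#include <cstdint>

// Hypothetical struct, large enough that it is not returned in registers.
struct Big {
    int values[1000];
    std::uintptr_t built_at;  // address at which the callee built the object
};

Big make_big(int v) {
    Big b;
    for (int& x : b.values) x = v;
    b.built_at = reinterpret_cast<std::uintptr_t>(&b);  // callee's view
    return b;
}

// Returns true when the callee's local object and the caller's variable
// turned out to be the same storage, i.e. no copy was made on return.
bool built_in_place(int v) {
    Big result = make_big(v);
    return result.built_at == reinterpret_cast<std::uintptr_t>(&result);
}
```

When `built_in_place` returns true, the local `b` was constructed directly in the caller's `result`, using the buffer whose address the caller passed in.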

An extra copy is still required if the identity of the object to return is not known at compile time, e.g.:

foo func(bool a) {
    foo x, y;
    // fill x and y
    return a ? x : y; // copying is required here
}
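The extra copy can be made visible with a counting copy constructor. In this sketch `Counted` and `pick` are hypothetical names. Because the returned expression is `a ? x : y` and not the name of a single local, the return value has to be copy-constructed from whichever object is chosen.

```cpp
// Hypothetical type that counts how often its copy constructor runs.
struct Counted {
    static int copies;
    int payload[100];

    Counted() : payload{} {}
    Counted(const Counted& other) {
        for (int i = 0; i < 100; ++i) payload[i] = other.payload[i];
        ++copies;
    }
};

int Counted::copies = 0;

Counted pick(bool a) {
    Counted x, y;
    x.payload[0] = 1;  // mark x
    y.payload[0] = 2;  // mark y
    return a ? x : y;  // which object is returned is only known at run time
}
```

After `Counted r = pick(true);` the counter `Counted::copies` is at least 1, whereas returning one named local can leave it at zero.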
