C: Does value return from the function take more CPU cycles than void?

Question

When using a void function in C versus a function that returns an arbitrary type (say int), will the int function take more CPU cycles than the void function?

Example:

int a, b = 1, c = 2;

void f()
{
     a = b + c;
}

int g()
{
     a = b + c;
     return a;
}

My common sense tells me that a return is an action, so it should take some of the CPU time, but I don't have proper deep fundamental knowledge that is needed here, nor do I know assembler to answer this question confidently by myself. Googling around was not successful either.

Edit: My interest is purely academic and I don't expect to gain any noticeable (or even close to that) amount of performance by using void versus int functions.

Yes – typically, a return value must be loaded into a target register (although a compiler may optimize it so your code would load it into that register *anyway*). But do your concerns disappear if you look at the actual [timing latency (PDF)](http://www.agner.org/optimize/instruction_tables.pdf) of a typical `mov` instruction? It is '1', meaning (in its most basic explanation) that for a 3.04GHz computer such as my Mac, it takes 1/3,040,000,000 of a second. — Jongware, Jul 13 '18 at 21:44
What does the optimized assembler output say? That's the way to find out, that and a lot of benchmarking if this is really important to you. Hint: It's probably not. — tadman, Jul 13 '18 at 21:44
@tadman You are right, it's not. My interest here is purely academic, and I don't expect to get any performance boost from learning how return values are handled. As for assembler, I did not generate the assembly code simply because I don't know it (yet). — ZenJ, Jul 13 '18 at 22:18
The assembly code is where you learn the most as the compiler will tell you how it interpreted your code. Figure out how you can get that "disassembly" output and you'll have all the answers you need, or at least be a lot closer to them. — tadman, Jul 13 '18 at 22:20

score 5 · Answer 1 · answered Jul 13 '18 at 21:46

That entirely depends on the CPU instruction set and calling conventions.

If, for example, the return value is always returned in a specific register, and the compiler can arrange the result of the calculation b+c to be in that specific register before inserting the return instruction, the code generated for those two functions may be identical.

However, this is not the kind of thing you want to think about optimizing in your program unless you exhausted all other options for performance improvements. And you certainly did not.
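A minimal sketch for checking this yourself, assuming GCC or Clang is available. It is the question's two functions in one file (the file name `ret.c` is hypothetical). Compiling with `gcc -O2 -S ret.c` writes the generated assembly to `ret.s`, where the two bodies can be compared side by side. On a target that returns `int` in a register they may come out identical, but that depends on the compiler and the calling convention.

```c
/* ret.c (hypothetical file name): the two functions from the question. */
int a, b = 1, c = 2;

/* Stores the sum in the global and returns nothing. */
void f(void)
{
    a = b + c;
}

/* Stores the same sum and also hands it back to the caller. */
int g(void)
{
    a = b + c;
    return a;
}
```

Both functions leave the same value in `a`; only `g` also exposes it as a return value.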

I fully agree that this is not how one should optimize the code. My interest here is purely academic, and I don't expect to get any performance boost from learning how return values are handled. — ZenJ, Jul 13 '18 at 22:32

score 3 · Answer 2 · answered Jul 13 '18 at 21:46

I don't think the question makes much sense.

If you need the returned value, you don't have the option to use void to speed it up - even if it would be faster.
If you don't need the result, it is useless to return it, so just don't do it. Either way, the choice is defined by the needs of the caller.

Typically, modern optimizing compilers can avoid a separate copy of the returned value, for instance when the call is inlined. For example, if you write int sum = f(a,b); the compiler will usually not create a temporary in your function and will instead put the result directly where sum lives (a register or its memory). In that case there is no difference in execution time.
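As a rough illustration of the two cases, here is a sketch with one caller that ignores the result and one that uses it. The names `update_only` and `update_and_read` are made up for this example, and `static` is an assumption that lets the compiler see every call site, so it is free to inline `g` and drop the unused return value. Whether it actually does so depends on the compiler and the optimization level.

```c
int a, b = 1, c = 2;

/* static: visible only in this file, so the compiler knows all callers. */
static int g(void)
{
    a = b + c;
    return a;
}

/* Caller that only wants the side effect; the result is discarded. */
void update_only(void)
{
    (void)g();
}

/* Caller that uses the result. */
int update_and_read(void)
{
    int sum = g();
    return sum + 1;
}
```

The choice between the two forms is made at each call site, which is the point above: the caller's needs decide whether the value is used.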

edited Jul 13 '18 at 21:52

answered Jul 13 '18 at 21:46

Aganju


Actually, sometimes this may come in handy: for example, if a function modifies some globals or class member variables but the return value is not needed everywhere the function is used. Regardless, I would like to apply my question only to a perfect 'academic knowledge' world, so basically it's a "what if" question that may not have a practical meaning. I apologize for not stating it clearly in the first place. – ZenJ Jul 13 '18 at 22:29
In your example case, the compiler knows when you use the result and when not, and constructs it in-place when you use it, and never returns it if you toss it. In other words: if you define a return value, and never use it anywhere, it is simply never returned, so there is no difference. – Aganju Jul 13 '18 at 22:30

score 3 · Accepted Answer · answered Jul 13 '18 at 22:12


On an x86_64 system, both functions can be compiled to the same code. I've "glossed" the disassembly with roughly equivalent C code:

f_or_g:
    pushq   %rbp                 ; // Standard stack frame setup
    movq    %rsp, %rbp           ; // same
    movl    OFFSET1(%rip), %eax  ; eax = c;
    addl    OFFSET2(%rip), %eax  ; eax += b;
    movq    OFFSET3(%rip), %rcx  ; rcx = &a;
    movl    %eax, (%rcx)         ; *rcx = eax;
    popq    %rbp                 ; // Standard stack frame teardown
    retq                         ; return

Since x86_64 uses eax as the return register for 32-bit values, the result of the addition is in the "right place" to return it already -- no extra code is needed.

In more complex functions, there may be some minor overhead required to ensure that the return value ends up in the right register. Generally speaking, though, that overhead should be pretty minimal.

The same principle applies to most other architectures -- this isn't specific to x86_64; I'm just using it because that's the first compiler that came to hand.
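To make the "more complex functions" case concrete, here is a hedged sketch in which the returned value is not the last thing computed. The function `h` and the extra global `d` are invented for this example. The compiler has to keep `sum` available until the `return`, which may cost an extra register or a single move, or nothing at all if enough registers are free.

```c
int a, b = 1, c = 2, d;

int h(void)
{
    int sum = b + c;   /* the value that will be returned */
    a = sum;
    d = sum * 2;       /* later work; sum must stay alive across it */
    return sum;        /* sum has to end up in the return register */
}
```

Comparing the optimized assembly of `h` with a `void` version of the same body is one way to see how small the difference is on a given target.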


Did I understand correctly that any processor that uses `eax` (or similar I guess -- sorry if I just said some nonsense!) will not require extra code? – ZenJ Jul 13 '18 at 22:20
This basically depends on the value you're returning being the last thing you calculated in the function, so it can ensure that the calculation puts its result in the same register. – Barmar Jul 13 '18 at 22:44
@ZenJ The exact name of the register doesn't matter. The only really important factor is that the architecture's calling convention uses registers for return values (most do), and that the architecture doesn't require certain registers to be used for specific operations (most don't; x86_64 does a bit but it doesn't come into play here). – Jul 13 '18 at 22:51
@Barmar It doesn't specifically have to be the _last_ value; there just have to be enough registers available that any work "after" the calculation of the return value can be performed without touching the return register. – Jul 13 '18 at 22:52
Would you be ok to include info from comments into the answer? I think they really add value to it. – ZenJ Jul 17 '18 at 22:50

score 0 · Answer 4 · answered Jul 14 '18 at 04:11

We cannot say whether extra cycles will be used, because that depends entirely on your processor. Moreover, if you need to return the value, the return statement only accesses the memory location of that element, so any difference in cost is small and negligible.
