8

Granted, micro-optimization is stupid and probably the cause of many mistakes in practice. Be that as it may, I have seen many people do the following:

void function( const double& x ) {}

instead of:

void function( double x ) {}

because it was supposedly "more efficient". Say that function is called ridiculously often in a program, millions of times; does this sort of "optimisation" matter at all?

Jonathan H
  • passing by reference can also be used to return more than one value from a function. – suspectus Jan 07 '14 at 21:18
  • @suspectus Sure yes, but that's not my question :) – Jonathan H Jan 07 '14 at 21:19
  • 1
    @suspectus not with const applied –  Jan 07 '14 at 21:20
  • related: http://stackoverflow.com/a/7958278/332733 – Mgetz Jan 07 '14 at 21:21
  • @Mgetz You are absolutely right, sorry! Let me read this post, and I might close my question then. Sorry again, I didn't find that one.. – Jonathan H Jan 07 '14 at 21:23
  • *"millions of times; I am personally convinced that this doesn't makes any difference.."* - Why would you be convinced of that? Of course it can (and will, on many platforms) make a difference; a 4 byte pointer is smaller than an 8 byte `double`. The question is; will that difference actually matter in my (your) code? To answer that question a specific example is needed. – Ed S. Jan 07 '14 at 21:23
  • @EdS. Agreed, that's kind of why I'm asking the question.. I understand how it could save some time, but what kind of gain are we looking at? Microseconds? Seconds? – Jonathan H Jan 07 '14 at 21:27
  • @Sh3ljohn: it's clearly much harder for some random stranger on the internet to tell you what difference it makes *specifically to your code*, than it is for you to make a one line change to your program and compare the two with a stopwatch. – Steve Jessop Jan 07 '14 at 21:33
  • @SteveJessop You're right, but I don't have a specific code. I'm just asking if this is the kind of detail you should pay attention to when you're programming with "small" scalar types basically, and if you should, when? – Jonathan H Jan 07 '14 at 21:36
  • See also [this answer](https://stackoverflow.com/a/29142434/472610), [this answer](https://stackoverflow.com/a/4705871/472610), and [this post](https://www.bfilipek.com/2016/12/please-declare-your-variables-as-const.html). – Jonathan H Sep 04 '19 at 12:07

4 Answers

12

Long story short: no, and particularly not on most modern platforms, where scalar and even floating-point types are passed in registers. The general rule of thumb I've seen bandied about is 128 bytes as the dividing line between passing by value and passing by reference.

Given that the data is already stored in a register, you're actually slowing things down by requiring the processor to go out to cache/memory to get the data. That could be a huge hit, depending on whether the cache line holding the data is still valid.

At the end of the day it really depends on what the platform ABI and calling convention is. Most modern compilers will even use registers to pass data structures if they will fit (e.g. a struct of two shorts etc.) when optimization is turned up.
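Since the outcome depends on the ABI, the most direct check is to time both signatures on the platform in question. The following is only a rough sketch under some assumptions: the names (`half_by_value`, `half_by_ref`, `sum_by_value`, `sum_by_ref`, `time_ms`, `report`) are made up for illustration, and the calls go through volatile function pointers so that, in practice, the compiler cannot inline them and the argument really is passed on every iteration.

```cpp
#include <chrono>
#include <cstdio>

double half_by_value(double x)      { return x * 0.5; }
double half_by_ref(const double& x) { return x * 0.5; }

// Calling through volatile function pointers discourages inlining,
// so each iteration performs a real call with a real argument.
double (*volatile call_by_value)(double)      = half_by_value;
double (*volatile call_by_ref)(const double&) = half_by_ref;

double sum_by_value(long n) {
    double acc = 0.0;
    for (long i = 0; i < n; ++i) {
        acc += call_by_value(static_cast<double>(i));
    }
    return acc;
}

double sum_by_ref(long n) {
    double acc = 0.0;
    for (long i = 0; i < n; ++i) {
        const double v = static_cast<double>(i);  // needs an address to bind to
        acc += call_by_ref(v);
    }
    return acc;
}

// Runs f(n) once, stores its result in `out`, returns elapsed milliseconds.
template <class F>
double time_ms(F f, long n, double& out) {
    const auto start = std::chrono::steady_clock::now();
    out = f(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Prints both timings; the sums are printed so the work is not discarded.
void report(long n) {
    double a = 0.0, b = 0.0;
    const double ms_value = time_ms(sum_by_value, n, a);
    const double ms_ref   = time_ms(sum_by_ref, n, b);
    std::printf("by value: %.2f ms, by reference: %.2f ms (sums %.1f / %.1f)\n",
                ms_value, ms_ref, a, b);
}
```

Calling `report(10000000)` from `main` prints one timing per variant. The numbers vary with compiler, flags, and machine, and with inlining enabled in real code the difference may disappear entirely.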

Mgetz
  • @AntonTykhyy it's really a judgement call thing, personally I would keep it smaller but that's me. I've seen 128 bytes in more than one place though. – Mgetz Jan 07 '14 at 21:40
  • @Basilevs people forget they are on multitasking systems... and that even a stack that should be close by may be a LOT farther away due to an interrupt or tick ending. – Mgetz Jan 07 '14 at 21:41
3

Passing by reference in this case is certainly not more efficient by itself. Note that qualifying the reference with const does not mean that the referenced object cannot change. Moreover, it does not even mean that the function itself cannot change it (if the referent is not const, the function can legally use const_cast to get rid of that const). Taking that into account, it is clear that passing by reference forces the compiler to take possible aliasing into account, which in the general case will lead to [significantly] less efficient code in the pass-by-reference case.
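To make the aliasing point concrete, here is a small sketch with two hypothetical functions, `scale_by_ref` and `scale_by_value`, that differ only in how the factor is passed. When the factor refers to an element of the array being modified, the reference version has to re-read it after every store, and the results differ.

```cpp
#include <cstddef>

// `factor` may refer to an element of `data`, so the compiler must
// reload it after every write to `data[i]`.
void scale_by_ref(const double& factor, double* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// `factor` is a private copy; writes to `data` cannot change it,
// so it can stay in a register for the whole loop.
void scale_by_value(double factor, double* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

With `double a[3] = {2.0, 3.0, 4.0}`, the call `scale_by_ref(a[0], a, 3)` leaves `{4.0, 12.0, 16.0}`, because the factor itself is overwritten after the first iteration, while `scale_by_value(a[0], a, 3)` on the same input leaves `{4.0, 6.0, 8.0}`.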

To take possible aliasing out of the picture, one would have to begin the reference version with

void function( const double& x ) {
  double non_aliased_x = x;
  // ... and use `non_aliased_x` from now on
  ...
}

but that would defeat the proposed reasoning for passing by reference in the first place.

Another way to deal with aliasing would be to use some sort of C99-style restrict qualifier. It is not part of standard C++, but GCC, Clang, and MSVC offer it as the __restrict extension:

void function( const double& __restrict x ) {}

but again, even in this case the cons of passing by reference will probably outweigh the pros, as explained in other answers.

AnT stands with Russia
2

With the reference version you save 4 bytes of copying to the stack during the function call. A double takes 8 bytes to store, whereas a pointer takes only 4 bytes in a 32-bit environment (in a 64-bit environment a pointer is 64 bits = 8 bytes, so you don't save anything). A reference is nothing more than a pointer with a bit of compiler support.
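The sizes involved are easy to check on a given target. This is a minimal sketch with made-up names (`bytes_by_value`, `bytes_by_reference`, `print_argument_sizes`); it assumes the usual implementation of a reference as a pointer, which the standard does not mandate. Note that `sizeof` applied to a reference type yields the size of the referenced type, so the pointer size is used instead.

```cpp
#include <cstddef>
#include <cstdio>

// Bytes copied when the double itself is the argument.
constexpr std::size_t bytes_by_value = sizeof(double);

// Bytes copied when an address is the argument. sizeof(const double&)
// would give sizeof(double), so measure the pointer type instead.
constexpr std::size_t bytes_by_reference = sizeof(const double*);

void print_argument_sizes() {
    std::printf("double: %zu bytes, pointer: %zu bytes\n",
                bytes_by_value, bytes_by_reference);
}
```

On a typical 32-bit target this reports 8 and 4 bytes; on a typical 64-bit target both are 8 bytes.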

SOReader
  • 3
    Or conversely, passing by reference *might* force a `double` stored in a register or on the FPU stack to be copied to the CPU stack. Swings and/or roundabouts. – Steve Jessop Jan 07 '14 at 21:26
2

Unless the function is inlined, and depending on the calling convention (the following assumes stack-based parameter passing, which in modern calling conventions is only used when the function has too many arguments*), there are two differences in how the argument is passed and used:

  • double: The (probably) 8 byte large value is written onto the stack and read by the function as is.
  • double & or double *: The value lies somewhere in the memory (might be "near" the current stack pointer, e.g. if it's a local variable, but might also be somewhere far away). A (probably) 4 or 8 byte large pointer address (32 bit or 64 bit system respectively) is stored on the stack and the function needs to dereference the address to read the value. This also requires the value to be in addressable memory, which registers aren't.

This means that the stack space required to pass the argument might be a little smaller when using references. This not only decreases the memory requirement but can also improve cache efficiency for the topmost bytes of the stack. On the other hand, when using references, the dereference adds some extra work.

To summarize, use references for large types (let's say when sizeof(T) > 32, or maybe even more). When stack size and cache hotness play a very important role, maybe already when sizeof(T) > sizeof(T*).
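This rule of thumb can be written down as a type alias. The sketch below is only an illustration: `param_t`, `kByValueLimit`, `Big`, `twice`, and `first` are hypothetical names, the 32-byte threshold is simply the figure suggested above, and the alias ignores other considerations such as whether `T` is cheap or even possible to copy.

```cpp
#include <cstddef>
#include <type_traits>

// Threshold taken from the rule of thumb above; tune per platform.
constexpr std::size_t kByValueLimit = 32;

// Small types are passed by value, larger ones by const reference.
template <class T>
using param_t =
    typename std::conditional<(sizeof(T) > kByValueLimit), const T&, T>::type;

struct Big {
    double values[16];  // well above the 32-byte limit
};

static_assert(std::is_same<param_t<double>, double>::value,
              "double is passed by value");
static_assert(std::is_same<param_t<Big>, const Big&>::value,
              "Big is passed by const reference");

double twice(param_t<double> x) { return 2.0 * x; }
double first(param_t<Big> b)    { return b.values[0]; }
```

One design caveat: a function template parameter written as `param_t<T>` is a non-deduced context, so the alias suits functions with concrete types or explicitly specified template arguments.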


*) See the comments on this and SOReader's answer for what's happening if this is not the case.

leemes
  • Ok then, all the comments seem to go in the same direction, I like your answer and that of MGetz – Jonathan H Jan 07 '14 at 21:35
  • 2
    Of course, this is all subject to a particular platform's ABI - it's entirely possible that neither the `double` nor the pointer is passed on the stack, but that both are passed in registers instead, in which case the difference is considerably smaller... – twalberg Jan 07 '14 at 21:36
  • @twalberg Yes of course. But most calling conventions use the stack for the arguments. I added a (hopefully) appropriate comment in my answer. Thanks for pointing this out. – leemes Jan 07 '14 at 21:44
  • @leemes I think you might be surprised if you took a look at a collection of modern ABIs. Almost all will use the stack if there are a "large" number of arguments, but if there's only one or two arguments, they're often going to be passed in registers - at least on desktop/server-type systems; embedded ABIs might be quite different in that respect, I guess... – twalberg Jan 07 '14 at 21:48
  • @twalberg Can you point me to some documentation? (I'm especially interested on how GCC does this, but I fail to find some different information than wikipedia, which says "function arguments are pushed on the stack in the reverse order"...) – leemes Jan 07 '14 at 21:52
  • Okay I've now read about "fastcall" and "System V". What I don't get is, how is ensured that the caller and callee use the same convention? How can I link against a library without telling the compiler what convention was used to compile it? But that's an entire different topic. What's important here is that my answer is only correct if the stack is used, and I want to edit it accordingly. – leemes Jan 07 '14 at 21:57
  • @leemes Many ABIs are available online for the searching - the x86 and x86_64 should be pretty easy to find, but SPARC, POWER, MIPS, ARM and many others should be readily available as well... There may be OS-dependent variations, too, so it depends on whether you're looking for Linux, Unix, Solaris, Windows, DOS, AIX, MIPS/OS, ..... Regarding fastcall vs SysV - if caller and callee don't use the same convention, then things will break. Systems usually use one by default, and using the other requires great care... – twalberg Jan 07 '14 at 21:58
  • @leemes: generally speaking the same person who writes down how big `long` is (and other such details) will also write down the calling convention. They're both part of an ABI, and in order for two binaries to be link-compatible they must both use the same ABI. If there are multiple calling conventions in use on a single platform (as for example in Win32) then you can decorate your functions with the calling convention which (like "C" linkage) becomes part of the function signature. – Steve Jessop Jan 07 '14 at 22:01
  • @twalberg That's why I assumed *stdcall* and *cdecl* are the "defaults" in Win32 API and for GCC resp. since that's what Wikipedia says. It further says that they're entirely stack-based for the in-parameters. So this was a false assumption I made? – leemes Jan 07 '14 at 22:01
  • 1
    @leemes: it's true for Win32, but 64-bit Windows code only uses one calling convention, fastcall, which uses up to 4 registers to pass arguments. GCC uses any number of different calling conventions when compiling for different targets. – Steve Jessop Jan 07 '14 at 22:03
  • @SteveJessop to reinforce that point, Linux and most other OSes that run on x86_64 have selected a single ABI (System V AMD64), which makes extensive use of registers to pass arguments. – Mgetz Jan 07 '14 at 22:10