9

In C#, method parameters can be either reference types or value types. When a reference type is passed, a copy of the reference is passed. So if we reassign the parameter to another object instance inside the method, the reassignment is not visible outside the method.

To make the reassignment visible, C# has the ref modifier. Passing a reference type with ref actually uses the original reference instead of a copy. (Correct me if I'm wrong.)

In this case, since we are not creating a copy of the reference, are we saving any memory? If a method is extensively called, does this improve the overall performance of the application?

Thanks!

abatishchev
Mister Smith
  • I don't think you should actually be motivated (at least not in this case) by performance. Like Mehrdad said in his answer, you should only ever use ref if you need to change the reference from within the method. And if your method must set its value, consider using the "out" keyword instead of "ref". – Kornelije Petak Aug 24 '11 at 07:28
  • in either case, copying a reference is *not* your bottleneck. – vidstige Aug 24 '11 at 07:41
  • @Kornelije Petak You mean I should not assume any internal pointer arithmetic is taking place, and instead see the ref modifier as just a mechanism to allow reference modification? – Mister Smith Aug 24 '11 at 07:47
  • @vidstige I know that if it saved any memory, we're talking about a few bytes. But what if the method is called, let's say, millions of times? – Mister Smith Aug 24 '11 at 07:49
  • @Mister Smith: Exactly. The size of a `ref` pointer is the same as that of an Object reference, so you're gaining *nothing* here -- just making your app slower (by a very small amount). – user541686 Aug 24 '11 at 07:54
  • @Misters the savings would have been on the stack. Sequential calling would re-use the memory, recursive calls would get you a StackOverFlow long before your millionth call. – H H Aug 24 '11 at 08:31

4 Answers

18

Claim

No, it doesn't. If anything, it's slower because of the extra lookup.

There's no reason to pass a reference type by reference unless you specifically intend to assign to it later.


Proof

Since some people seem to think that the compiler passes "the variable itself", take a look at the disassembly of this code:

using System;

static class Program
{
    static void Test(ref object o) { GC.KeepAlive(o); }

    static void Main(string[] args)
    {
        object temp = args;
        Test(ref temp);
    }
}

which is (on x86, for simplicity):

// Main():
// Set up the stack
00000000  push        ebp                    // Save the base pointer
00000001  mov         ebp,esp                // Set up the frame (base) pointer
00000003  sub         esp,8                  // Reserve space for local variables
00000006  xor         eax,eax                // Zero out the EAX register

// Initialize the locals: zero `temp`, then save the incoming argument
00000008  mov         dword ptr [ebp-4],eax  // Zero the local at [ebp-4] (temp)
0000000b  mov         dword ptr [ebp-8],ecx  // Save ECX (the first argument, args, in the CLR's x86 calling convention)
0000000e  cmp         dword ptr ds:[00318D5Ch],0  // Compare this against zero
00000015  je          0000001C               // Jump if it was null (?)
00000017  call        6F910029               // (Calls some internal method, idk)

// THIS is where our code finally starts running
0000001c  mov         eax,dword ptr [ebp-8]  // Copy the reference to register
0000001f  mov         dword ptr [ebp-4],eax  // ** COPY it AGAIN to memory
00000022  lea         ecx,[ebp-4]            // ** Take the ADDRESS of the copy
00000025  call        dword ptr ds:[00319734h] // Call the method

// We're done with the call
0000002b  nop                                // Do nothing (breakpoint helper)
0000002c  mov         esp,ebp                // Restore stack
0000002e  pop         ebp                    // Restore the caller's base pointer
0000002f  ret                                // Return

This was from an optimized compilation of the code. Clearly, there's an address of a variable being passed, and not "the variable itself".
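To check the claim empirically rather than from the disassembly, a rough timing sketch along the following lines could be used. The names RefBenchmark, ByValue, ByRef and Run are made up for illustration, and the numbers will depend on the runtime, the JIT and the machine, so treat it as a starting point and not as a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

static class RefBenchmark
{
    // NoInlining keeps the JIT from inlining away the calls being measured.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ByValue(string s) { return s.Length; }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ByRef(ref string s) { return s.Length; }

    // Calls one of the two methods `iterations` times and returns the summed lengths.
    public static long Run(bool useRef, int iterations, out TimeSpan elapsed)
    {
        string text = "hello";
        long total = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            total += useRef ? ByRef(ref text) : ByValue(text);
        }
        watch.Stop();
        elapsed = watch.Elapsed;
        return total;
    }

    static void Main()
    {
        const int iterations = 100000000;
        TimeSpan t;

        // Warm-up runs so that JIT compilation is not included in the timings.
        Run(false, 1000, out t);
        Run(true, 1000, out t);

        Run(false, iterations, out t);
        Console.WriteLine("by value: " + t.TotalMilliseconds + " ms");
        Run(true, iterations, out t);
        Console.WriteLine("by ref:   " + t.TotalMilliseconds + " ms");
    }
}
```

Run it in a Release build without the debugger attached. Any difference is likely to be small compared with the cost of the call itself.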

user541686
  • I don't think an extra lookup is involved. – H H Aug 24 '11 at 07:33
  • @Henk: Agreed. On the contrary: Not passing by ref will create a new pointer under the hood. – Daniel Hilgarth Aug 24 '11 at 07:37
  • @Daniel: There's a pointer copied *in **both cases***. It's just whether it's a direct pointer or a pointer to another pointer that's the issue -- and when you use `ref`, it's the latter, which involves an extra memory lookup (unless the JIT optimizes it by proving it unnecessary). – user541686 Aug 24 '11 at 07:38
  • @Mehrdad Can you explain why the extra lookup? Theoretically we're creating a new 'pointer' somewhere when not using ref, isn't it? – Mister Smith Aug 24 '11 at 07:41
  • @Mister: I *just* explained it in my last comment. When you use `ref`, you're taking the *address* of the reference -- which is another pointer. So to get to the original object, you will have to go through **two** pointers instead of one, each time. – user541686 Aug 24 '11 at 07:42
  • @Mehrdad: Not correct. Reference types are just pointers under the hood. When not passing by ref, you create a new pointer. My answer proves that. That is, not passing by ref will require sizeof(pointer) (normally 32 bit on a 32 bit system) more memory than the version with by ref. – Daniel Hilgarth Aug 24 '11 at 07:43
  • @Daniel: Really? If so, what do you think happens when you ***do*** pass by reference? – user541686 Aug 24 '11 at 07:44
  • @Mehrdad: When passing by ref you are not passing a pointer to a pointer, but you are passing THE only pointer to your object as opposed to creating a new one when not passing by ref. – Daniel Hilgarth Aug 24 '11 at 07:44
  • @Daniel: I'd hate to say this, but you don't seem to be very familiar with how pass-by-reference works. (It's just syntactic sugar for pass-by-pointer, which involves taking an address.) Either that, or you're implicitly assuming that the JIT will always optimize the case, which isn't a great assumption and which is invalid for many cases. – user541686 Aug 24 '11 at 07:45
  • @Mehrdad: I inferred what I said from C++. It is a sensible assumption that it works the same way in C#. Please provide proof or arguments that this is not the case. – Daniel Hilgarth Aug 24 '11 at 07:46
  • @Daniel: It's the **same** way in C++ **without** optimizations, but the compiler has a *lot* more leeway in optimizing it away (e.g. undefined behavior allows for **lots** of optimizations in C++ that can't happen in .NET) and just re-using the register itself. Not quite as true in .NET, because of things like reflection, etc. which make it a lot harder to create an "optimized" custom calling convention for a similar piece of code -- assumptions don't carry across languages like that. – user541686 Aug 24 '11 at 07:48
  • can you please explain the x86 assembly with more comments ? I am not familiar with x86. I really want to see how things are done under the hood – emre nevayeshirazi Aug 24 '11 at 08:13
  • @nevayeshirazi: I put in explanations. Part of the disassembly is runtime-specific and is unrelated to the method call itself. The method call clearly involves copying the variable (i.e. Object reference) to memory and taking its address again (which is a pointer to the reference), which means there's another indirection on top of the original. – user541686 Aug 24 '11 at 08:26
  • @Mehrdad good job. I don't fully understand how some of the assembly lines relate to the original code. But just out of curiosity, how did you get the final executable, isn't C# interpreted? Or is there a disassembly tool in Visual Studio? (noob question, I know XD). I'd like to do the same test without ref. – Mister Smith Aug 24 '11 at 09:34
  • @Mister: It's the latter -- if you right-click a stack frame in Visual Studio and click Go to Assembly, it shows you the disassembly right there -- which includes the original code (which I didn't show here). (If you don't see the options, right-click the disassembly and click the Show Source option.) – user541686 Aug 24 '11 at 11:42
  • @Mehrdad: tested and posted in a new answer. Could you explain the difference, if any? Is it a JIT optimization? – Mister Smith Aug 25 '11 at 09:06
  • @Mister: The difference seems to be exactly what I already explained: The second one *copies* the Object reference (it's passed by value), whereas the first one passes a *pointer* to the Object reference (it's passed by reference, which is passing by pointer). So the first case requires **two** pointer lookups by the callee, whereas the second one only requires one (since it already has a copy of the original pointer, and doesn't need to look it up). – user541686 Aug 25 '11 at 09:13
  • @Mehrdad I don't understand why the LEA instruction is less efficient than the MOV one in this case. – Mister Smith Aug 25 '11 at 09:26
  • @Mister: The *instruction* isn't faster. What *is* faster is when the **callee** wants to **access** the original object, because it has to go through fewer lookups when it **already has** the Object reference (MOV case) as opposed to when it has to look it up (LEA case). – user541686 Aug 25 '11 at 09:29
  • @Mehrdad So LEA loads the address of temp into a register, while MOV loads the value of temp (the object's address), and the function (not shown) in the first case will have to access memory twice, right? – Mister Smith Aug 25 '11 at 09:36
8

DISASSEMBLY VIEW OF Mehrdad's example (BOTH VERSIONS)

I'll try to dig a little deeper into Mehrdad's nice proof, for those who, like me, are not very good at reading assembly code. This code can be captured in Visual Studio while debugging, by clicking Debug -> Windows -> Disassembly.

VERSION USING REF

Source Code:

 using System;

 namespace RefTest
 {
    class Program
    {
        static void Test(ref object o) { GC.KeepAlive(o); }

        static void Main(string[] args)
        {
            object temp = args;
            Test(ref temp);
        }
    }
 }

Assembly language (x86) (only showing the part that differs):

             object temp = args;
 00000030  mov         eax,dword ptr [ebp-3Ch] 
 00000033  mov         dword ptr [ebp-40h],eax 
             Test(ref temp);
 00000036  lea         ecx,[ebp-40h] // loads the address of temp (a pointer to the reference) into ecx
 00000039  call        FD30B000      
 0000003e  nop              
         }  

VERSION WITHOUT REF

Source Code:

 using System;

 namespace RefTest
 {
    class Program
    {
        static void Test(object o) { GC.KeepAlive(o); }

        static void Main(string[] args)
        {
            object temp = args;
            Test(temp);
        }
    }
 }

Assembly language (x86) (only showing the part that differs):

             object temp = args;
 00000035  mov         eax,dword ptr [ebp-3Ch] 
 00000038  mov         dword ptr [ebp-40h],eax 
             Test(temp);
 0000003b  mov         ecx,dword ptr [ebp-40h] // loads the value of temp (the object reference) into ecx
 0000003e  call        FD30B000 
 00000043  nop              
         }

Apart from the commented line, the code is the same for both versions: with ref, the call to the function is preceded by a LEA instruction; without ref, by a simpler MOV instruction. After this line executes, LEA has loaded the ecx register with a pointer to a pointer to the object, whereas MOV has loaded ecx with a pointer to the object. This means that the FD30B000 subroutine (pointing to our Test function) must, in the first case, make an extra memory access to reach the object. If we inspect the assembly code for each version of this function, we can see that at some point (in fact, on the only line that differs between the two versions) the extra access is made:

static void Test(ref object o) { GC.KeepAlive(o); }
...
00000025  mov         eax,dword ptr [ebp-3Ch] 
00000028  mov         ecx,dword ptr [eax]
...

While the function without ref can go straight to the object:

static void Test(object o) { GC.KeepAlive(o); }
...
00000025  mov         ecx,dword ptr [ebp-3Ch]
...
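The extra level of indirection can also be observed from C# itself, without reading assembly. The sketch below (AliasDemo, slot and the method names are invented for illustration) changes the caller's variable while the callee is running: the ref parameter sees the change because it refers to the variable, while the by-value parameter keeps its own copy of the old reference.

```csharp
using System;

static class AliasDemo
{
    static object slot = "old";

    // `o` is an alias for the caller's variable, so a write to `slot` shows through it.
    public static bool SeenThroughRef(ref object o)
    {
        slot = "new";
        return ReferenceEquals(o, slot);
    }

    // `o` is a copy of the reference taken at call time; it still points to "old".
    public static bool SeenThroughValue(object o)
    {
        slot = "new";
        return ReferenceEquals(o, slot);
    }

    public static bool CallWithRef()
    {
        slot = "old";
        return SeenThroughRef(ref slot);
    }

    public static bool CallWithValue()
    {
        slot = "old";
        return SeenThroughValue(slot);
    }

    static void Main()
    {
        Console.WriteLine(CallWithRef());   // True
        Console.WriteLine(CallWithValue()); // False
    }
}
```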

Hope it helped.

Mister Smith
5

Yes, there is a reason: if you want to reassign the value. In that regard, there is no difference between value types and reference types.

See the following example:

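using System;

class A
{
    public int B { get; set; }
}

static class Program
{
    // a is a copy of the caller's reference, so reassigning it is invisible to the caller.
    static void ReassignA(A a)
    {
        Console.WriteLine(a.B);
        a = new A { B = 2 };
        Console.WriteLine(a.B);
    }

    static void Main()
    {
        A a = new A { B = 1 };
        ReassignA(a);
        Console.WriteLine(a.B);
    }
}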

This will output:

1
2
1
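For contrast, here is a minimal sketch of the same idea with the ref modifier. The class name RefReassign is made up for this illustration; A is the same small class as in the example above. With ref, the reassignment inside the method is visible to the caller.

```csharp
using System;

class A
{
    public int B { get; set; }
}

static class RefReassign
{
    // With ref, `a` refers to the caller's variable, so the assignment replaces it.
    public static void ReassignA(ref A a)
    {
        a = new A { B = 2 };
    }

    static void Main()
    {
        A a = new A { B = 1 };
        ReassignA(ref a);
        Console.WriteLine(a.B); // 2
    }
}
```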

Performance, however, has nothing to do with it. That would be a real micro-optimization.

Daniel Hilgarth
-1

Passing a reference type by value does not copy the object. It only creates a new reference to the existing object. So you should not pass it by reference unless you really need to.
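As a small sketch of this point (SharedObjectDemo and Append are hypothetical names), a method that receives a list by value still works on the caller's list object, because only the reference was copied:

```csharp
using System;
using System.Collections.Generic;

static class SharedObjectDemo
{
    // `list` is a copy of the reference, but it points to the same List<int> object.
    public static void Append(List<int> list, int value)
    {
        list.Add(value);
    }

    static void Main()
    {
        List<int> numbers = new List<int>();
        Append(numbers, 42);
        Console.WriteLine(numbers.Count); // 1
    }
}
```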

emre nevayeshirazi