Value types are copied by value -- hence the name. So we must consider when a copy of a value must be made. This comes down to analyzing correctly whether a particular entity refers to a variable or to a value. If it refers to a value, then that value was copied from somewhere. If it refers to a variable, then it's just a variable and can be treated like any other variable.
Suppose we have
struct Foo { public int A; public int B; }
Ignore for the moment the design flaws here; public fields are a bad code smell, as are mutable structs.
If you say
Foo f = new Foo();
what happens? The spec says:
- A new eight byte variable f is created.
- A temporary eight byte storage location temp is created.
- temp is filled in with eight bytes of zeros.
- temp is copied to f.
But that is not what actually happens; the compiler and runtime are smart enough to notice that there is no observable difference between the required workflow and the workflow "create f and fill it with zeros", so that happens. This is a copy elision optimization.
EXERCISE: devise a program in which the compiler cannot copy-elide, and the output makes it clear that the compiler does not perform a copy elision when initializing a variable of struct type.
Now if you say
f.A = 123;
then f is evaluated to produce a variable -- not a value -- and then from that A is evaluated to produce a variable, and four bytes are written to that variable.
If you say
int x = f.A;
then f is evaluated as a variable, A is evaluated as a variable, and the value of A is written to x.
If you say
Foo[] fs = new Foo[1];
then variable fs is allocated, the array is allocated and initialized with zeros, and the reference to the array is copied to fs.
When you say
fs[0].A = 123;
the same thing happens as before: fs[0] is evaluated as a variable, so A is a variable, so 123 is copied to that variable.
When you say
int x = fs[0].A;
same as before: we evaluate fs[0] as a variable, fetch from that variable the value of A, and copy it.
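As a minimal sketch of the array case just described: writing through fs[0] changes the element stored in the array, whereas first assigning the element to a local makes a copy. The names ArrayAliasDemo, WriteThroughElement and WriteThroughCopy are made up for this illustration.

```csharp
using System;

struct Foo { public int A; public int B; }

static class ArrayAliasDemo
{
    // fs[0] is classified as a variable, so the write lands in the array itself.
    public static int WriteThroughElement()
    {
        Foo[] fs = new Foo[1];
        fs[0].A = 123;
        return fs[0].A;
    }

    // Assigning fs[0] to a local copies the value; mutating the local
    // leaves the array element untouched.
    public static int WriteThroughCopy()
    {
        Foo[] fs = new Foo[1];
        Foo copy = fs[0];
        copy.A = 456;
        return fs[0].A;
    }

    static void Main()
    {
        Console.WriteLine(WriteThroughElement()); // prints 123
        Console.WriteLine(WriteThroughCopy());    // prints 0
    }
}
```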
But if you say
List<Foo> list = new List<Foo>();
list.Add(new Foo());
list[0].A = 123;
then you will get a compiler error, because list[0] is a value, not a variable. You can't change it.
If you say
int x = list[0].A;
then list[0] is evaluated as a value -- a copy of the value stored in the list is made -- and then a copy of A is made in x. So there is an extra copy here.
EXERCISE: Write a program that illustrates that list[0] is a copy of the value stored in the list.
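One possible answer to this exercise, offered as a sketch rather than the only solution: take list[0] into a local, mutate the local, and read the list again. ListCopyDemo and MutateCopy are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;

struct Foo { public int A; public int B; }

static class ListCopyDemo
{
    // Returns the value of A stored in the list after mutating what list[0] returned.
    public static int MutateCopy()
    {
        var list = new List<Foo>();
        list.Add(new Foo());

        // list[0].A = 123;  // does not compile (error CS1612): list[0] is a value

        Foo copy = list[0];  // the indexer's getter hands back a copy
        copy.A = 123;        // changes only the local copy
        return list[0].A;    // the element inside the list is unchanged
    }

    static void Main()
    {
        Console.WriteLine(MutateCopy()); // prints 0
    }
}
```

To actually change the element, the whole struct has to be written back through the setter, as in list[0] = copy.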
It is for this reason that you should (1) not make big structs, and (2) make them immutable. Structs get copied by value, which can be expensive, and values are not variables, so it is hard to mutate them.
What makes the array indexer return a variable but the list indexer not? Are arrays treated in a special way?
Yes. Arrays are very special types that are built deeply into the runtime and have been since version 1.
The key feature here is that an array indexer logically produces an alias to the variable contained in the array; that alias can then be used as the variable itself.
All other indexers are actually pairs of get/set methods, where the get returns a value, not a variable.
Can I create my own class that behaves the same as an array in this regard?
Before C# 7, not in C#. You could do it in IL, but of course then C# wouldn't know what to do with the returned alias.
C# 7 adds the ability for methods to return aliases to variables: ref returns. Remember, ref (and out) parameters take variables as their operands and cause the callee to have an alias to that variable. C# 7 adds the ability to do this to locals and returns as well.
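Here is a rough sketch of what that can look like, assuming C# 7.0 or later. FooBuffer is a hypothetical wrapper class invented for this example; its indexer returns by ref, so buffer[0] is a variable in the same way that an array element is.

```csharp
using System;

struct Foo { public int A; public int B; }

// Hypothetical collection whose indexer returns an alias instead of a copy.
sealed class FooBuffer
{
    private readonly Foo[] items;

    public FooBuffer(int length) { items = new Foo[length]; }

    // ref return: the caller receives an alias to the array element.
    public ref Foo this[int index] => ref items[index];
}

static class RefReturnDemo
{
    public static int Run()
    {
        var buffer = new FooBuffer(1);

        buffer[0].A = 123;              // legal: buffer[0] is a variable here

        ref Foo alias = ref buffer[0];  // ref local: another alias to the same element
        alias.B = 456;

        return buffer[0].A + buffer[0].B;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 579
    }
}
```

The trade-off is that callers can now mutate the collection's storage directly, which is exactly what the get/set indexer pair on List<Foo> prevents.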