25

How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?

Are there performance, semantic or any other benefits of choosing one over the other?

Garry Shutler

11 Answers

17

There are a few things to consider:

A struct is allocated on the stack (usually). It is a value type, so passing the data around across methods can be costly if it is too large.

A class is allocated on the heap. It is a reference type, so passing the object around through methods is not as costly.

Generally, I use structs for immutable objects that are not very large. I only use them when there is a limited amount of data being held in them or I want immutability. An example is the DateTime struct. I like to think that if my object is not as lightweight as something like a DateTime, it is probably not worth being used as a struct. Also, if my object makes no sense being passed around as a value type (also like DateTime), then it may not be useful to use as a struct. Immutability is key here though. Also, I want to stress that structs are not immutable by default. You have to make them immutable by design.
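The point that structs are only immutable by design could be sketched roughly as follows. `Money` and `MoneyDemo` are hypothetical names used purely for illustration; the sketch assumes nothing beyond `readonly` fields and a constructor.

```csharp
using System;

// Hypothetical value type for illustration; not taken from the answer above.
public struct Money
{
    // readonly fields: the compiler rejects assignments outside the constructor.
    private readonly decimal _amount;
    private readonly string _currency;

    public Money(decimal amount, string currency)
    {
        _amount = amount;
        _currency = currency;
    }

    public decimal Amount { get { return _amount; } }
    public string Currency { get { return _currency; } }

    // "Changing" the value returns a new instance instead of modifying this one.
    public Money Add(decimal delta)
    {
        return new Money(_amount + delta, _currency);
    }
}

public static class MoneyDemo
{
    // Returns true when the original value is untouched by Add.
    public static bool OriginalIsUnchanged()
    {
        Money price = new Money(10m, "EUR");
        Money raised = price.Add(5m);
        return price.Amount == 10m && raised.Amount == 15m;
    }
}
```

Without the `readonly` modifiers and with public setters, the same struct would compile just as happily, which is why immutability has to be a deliberate choice.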

In 99% of situations I encounter, a class is the proper thing to use. I find myself not needing immutable classes very often. It's more natural for me to think of classes as mutable in most cases.

Dan Herbert
  • One of the benefits you missed is that structs enjoy pass-by-value semantics. (Objects are pass-by-reference, while object-references are pass-by-value, for all you finicky people out there.) – yfeldblum Feb 23 '09 at 03:08
  • That was the first point I implied, when I said "[a struct] is a value type" and "[a class] is a reference type". – Dan Herbert Feb 23 '09 at 03:56
  • @DanHerbert : Re: "A struct is allocated on the stack (usually). ". That "usually" is confusing! A Struct is always allocated on a stack. – Manish Basantani Jul 09 '12 at 07:06
  • @Amby If a struct is used as a property on an object, it is allocated to the heap, not to the call stack which is what most people think of when referring to "the stack". – Dan Herbert Jul 09 '12 at 11:04
  • @DanHerbert: Agree. But then we are talking about a "boxed" value of a struct, which gets converted to a reference type and hence gets allocated in heap. The original value of a struct "always" gets allocated on a stack, right? – Manish Basantani Jul 10 '12 at 03:56
  • 2
    @Amby read this article. It should help you understand. http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx Structs don't need to be boxed to get allocated to the heap. – Dan Herbert Jul 10 '12 at 11:30
  • @DanHerbert: I accept my mistake!. Thanks a lot for sharing that post, it is really an eye-opener. Thanks. – Manish Basantani Jul 20 '12 at 09:13
  • @yfeldblum - What benefit do I get from pass-by-value semantics? If it's because the original object won't be modified - I could just create an immutable class and get the same effect? – BornToCode Aug 11 '16 at 08:31
  • 1
    @BornToCode - It requires no allocation to create or dereferencing to use. It has cache locality. Copies can be optimized using SSE (`memcpy` may be optimized to use SSE automatically). – yfeldblum Aug 12 '16 at 22:53
14

I like to use a thought experiment:

Does this object make sense when only an empty constructor is called?

Edit at Richard E's request

A good use of struct is to wrap primitives and scope them to valid ranges.

For example, probability has a valid range of 0-1. Using a decimal to represent this everywhere is prone to error and requires validation at every point of usage.

Instead, you can wrap a primitive with validation and other useful operations. This passes the thought experiment because most primitives have a natural 0 state.

Here is an example usage of struct to represent probability:

using System;

public struct Probability : IEquatable<Probability>, IComparable<Probability>
{
    public static bool operator ==(Probability x, Probability y)
    {
        return x.Equals(y);
    }

    public static bool operator !=(Probability x, Probability y)
    {
        return !(x == y);
    }

    public static bool operator >(Probability x, Probability y)
    {
        return x.CompareTo(y) > 0;
    }

    public static bool operator <(Probability x, Probability y)
    {
        return x.CompareTo(y) < 0;
    }

    public static Probability operator +(Probability x, Probability y)
    {
        return new Probability(x._value + y._value);
    }

    public static Probability operator -(Probability x, Probability y)
    {
        return new Probability(x._value - y._value);
    }

    private readonly decimal _value;

    public Probability(decimal value) : this()
    {
        if(value < 0 || value > 1)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        _value = value;
    }

    public override bool Equals(object obj)
    {
        return obj is Probability && Equals((Probability) obj);
    }

    public override int GetHashCode()
    {
        return _value.GetHashCode();
    }

    public override string ToString()
    {
        return (_value * 100).ToString() + "%";
    }

    public bool Equals(Probability other)
    {
        return other._value.Equals(_value);
    }

    public int CompareTo(Probability other)
    {
        return _value.CompareTo(other._value);
    }

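    public double ToDouble()
    {
        // Explicit conversion: decimal does not convert to double implicitly.
        return (double) _value;
    }

    public decimal WeightOutcome(double outcome)
    {
        // decimal and double cannot be mixed in arithmetic, so convert first.
        return _value * (decimal) outcome;
    }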
}
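The thought experiment itself can be tried out in code. The sketch below uses two hypothetical structs (`Percentage` and `CustomerName`, neither from the answer) to show that `default(T)` produces a zeroed instance without running the declared constructor, so any validation there is skipped.

```csharp
using System;

// Hypothetical type: its zero state (0%) is a meaningful value.
public struct Percentage
{
    private readonly decimal _value;

    public Percentage(decimal value)
    {
        if (value < 0 || value > 100)
            throw new ArgumentOutOfRangeException("value");
        _value = value;
    }

    public decimal Value { get { return _value; } }
}

// Hypothetical type: its zero state has a null name, which the constructor would reject.
public struct CustomerName
{
    private readonly string _name;

    public CustomerName(string name)
    {
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("A name is required.", "name");
        _name = name;
    }

    public string Name { get { return _name; } }
}

public static class DefaultValueDemo
{
    // default(T) zero-initializes the struct; the declared constructor never runs.
    public static decimal ZeroPercentage()
    {
        return default(Percentage).Value;
    }

    // The validation in CustomerName's constructor is bypassed here.
    public static bool DefaultNameIsNull()
    {
        return default(CustomerName).Name == null;
    }
}
```

By this test `Percentage` is a reasonable struct candidate, while `CustomerName` would probably be better off as an immutable class.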
Bryan Watts
  • I don't understand. You can create objects that don't have a default constructor. – Garry Shutler Feb 22 '09 at 22:42
  • 1
    Structs *always* have a default constructor, even if you don't define one. Therefore, a struct can always be instantiated to an "empty" instance (such as new Int32()). If the object doesn't make sense without a particular constructor, it should probably be an immutable class. – Bryan Watts Feb 22 '09 at 22:44
  • In C# classes also have a default constructor, even if none is declared. – Richard Ev Feb 22 '09 at 22:54
  • What I meant is that the default constructor can be hidden (made private for example) if required. – Garry Shutler Feb 22 '09 at 22:57
  • 2
    But if you declare a non-default constructor in a class, the default one goes away and you can *only* use the non-default. With structs, the default is always there and cannot be removed. – Bryan Watts Feb 22 '09 at 22:57
  • 1
    @Garry Shutler: you cannot declare a default constructor in a struct (and therefore can't set its visibility). There is *no way* to prevent someone from using a default constructor with a struct. – Bryan Watts Feb 22 '09 at 22:58
  • I was replying to Richard E with regard to the default constructors for classes but I didn't make that clear. I realise structs have to have a default constructor. – Garry Shutler Feb 22 '09 at 23:02
  • True, the default constructor for a class is removed if you declare a parameterised constructor. I'm not sure what that proves about the struct vs class discussion though. – Richard Ev Feb 22 '09 at 23:03
  • The point I made is that since structs always have default constructors, and that makes them different from classes, contemplating that difference and its connotations is a good way to think about the decision. If a "zeroed-out" instance makes sense, a struct is a possibility. – Bryan Watts Feb 22 '09 at 23:11
14

How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?

I think your options are wrong. Immutable object and struct are not opposites, nor are they the only options. Rather, you've got four options:

  • Class
    • mutable
    • immutable
  • Struct
    • mutable
    • immutable

I argue that in .NET, the default choice should be a mutable class to represent logic and an immutable class to represent an entity. I actually tend to choose immutable classes even for logic implementations, if at all feasible. Structs should be reserved for small types that emulate value semantics, e.g. a custom Date type, a Complex number type, and similar entities. The emphasis here is on small, since you don't want to copy large blobs of data, and indirection through references is actually cheap (so we don't gain much by using structs). I tend to make structs always immutable (I can't think of a single exception at the moment). Since this best fits the semantics of the intrinsic value types, I find it a good rule to follow.
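One reason to keep structs immutable can be sketched with a deliberately mutable struct. `Counter` and `Holder` are hypothetical names; the sketch relies on the C# rule that calling a method on a `readonly` struct field (outside a constructor) operates on a temporary copy.

```csharp
// Hypothetical mutable struct, written this way only to show the pitfall.
public struct Counter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

public class Holder
{
    private readonly Counter _readonlyCounter = new Counter();
    private Counter _counter = new Counter();

    public int IncrementReadonly()
    {
        // The call compiles, but it mutates a temporary copy of the field.
        _readonlyCounter.Increment();
        return _readonlyCounter.Count;
    }

    public int IncrementMutable()
    {
        // Here the field itself is modified.
        _counter.Increment();
        return _counter.Count;
    }
}
```

`IncrementReadonly` silently discards the increment while `IncrementMutable` keeps it. An immutable struct has no mutating methods, so this kind of surprise cannot occur.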

Konrad Rudolph
  • 1
    What do you mean by "represent logic"? Could you give an example? Many entities (such as a person) are not defined by their properties so why do you argue they should be immutable? – gkdm Oct 07 '09 at 15:01
  • 1
    @Avid: int (System.Int32), bool (System.Boolean), double (System.Double), etc are all immutable. Unlike in old versions of FORTRAN, you cannot change the value of `3` in C#. If you have `readonly int x = 3;`, `x` will always have the value 3, and you can't change that value. However, if you have `struct Point { int x; int y; }` and `readonly Point a = new Point();` you can change the value contained in a: `a.x = 42;`. – R. Martinho Fernandes Aug 30 '10 at 23:13
  • @Martinho, in that case the immutability is a result of the "readonly", not intrinsic to int. In fact, if it was 'int x = 3;' you can absolutely change the value of 'x'. '3' on the other hand, is a constant int, not a regular variable. – AviD Sep 01 '10 at 12:12
  • @Avid: Notice I used readonly on *both* cases. Yet they differ. I never reassigned the variable `a` in the second example. I simply *mutated* it. You acknowledged yourself that strings are immutable. So if I have `string s = "foo";` I can absolutely change the value of `s`, and that does not make strings mutable. 3 is immutable. Ints are immutable. Strings are immutable. Mutable *variables* are mutable. Immutable *variables* (aka readonly) are immutable. Somewhat related: http://devnonsense.blogspot.com/2009/11/immutable-data-is-thread-safe.html – R. Martinho Fernandes Sep 01 '10 at 14:01
  • @Martinho, I think we're talking at different levels of indirection: Wrt strings, if you change the value of s, the old string object is *thrown out* (and then GC'd), while a *completely new instance of System.String* is created for you with the new value. So, yes, you changed the "value" of s only insomuch as you're talking about the reference - i.e. you changed the address that s is now pointing to, but the original string instance *was not changed*. This is not the case with a System.Int32 instance, where `int i = 1; i = 3;` keeps the original instance, changing only *its value*. – AviD Sep 01 '10 at 21:20
  • 1
    @Avid: Why is the case with int different? You can see `int i = 1; i = 3;` as throwing the 1 out and replacing it with an instance of the number 3. Think about `int i = 1; i = new int();`. Is it changing the existing value, or replacing it with a new instance of System.Int32? The fact that you can assign values to variables doesn't make a type immutable or mutable. It's the (in)existence of mutable fields and/or mutator methods, which neither string nor int have. Also, MSDN states clearly that Int32 is an immutable type. – R. Martinho Fernandes Sep 02 '10 at 12:56
  • @AviD You are confusing immutability of the value with immutability of the variable containing the value. Yes, you can mutate variables of type integer. But you can't mutate the underlying integers. You can't increase the value of 3. Conversely, you can both mutate variables of type list and mutate the underlying list. – Craig Gidney Sep 20 '10 at 13:37
  • @Strilanc, it is you and @Martinho that are confusing it. Or it might just be we have different meanings for "immutability". For me, if an object is immutable, you cannot change its value in-place. If you try to change the value, copy-on-write semantics are used, to replace the object with a new object, and change the original variable to refer to the new object. This doesnt have to be through mutable fields or mutator methods, it can be just a change to the *value* of the object itself. – AviD Sep 20 '10 at 21:45
  • 3
    @AviD: This discussion is moot. You are confusing value with symbol. These have a very well-defined meaning in computer science, shame on Microsoft for being sloppy with terminology. But the fact is: a variable is a *symbol*, and unless you declare that `const` or `readonly` in .NET, a variable is always mutable. A value, on the other hand, is something completely different. And values in .NET are immutable for all basic types. You cannot change a value, end of story. All you can do is assign a new value to an existing symbol. (continued …) – Konrad Rudolph Sep 21 '10 at 08:17
  • @Avid: (cont’d) You admitted yourself that strings are immutable. Well, how does `s += "foo"` differ conceptually from `i += 1`? Not at all, that’s how. In both cases you read the value of the symbol/variable, throw it out and replace it with a new value. The symbol is mutable in both cases, the value is not. – Konrad Rudolph Sep 21 '10 at 08:19
  • 1
    @Konrad, it's not the value OR the symbol that is mutable or not, it's the *object*. `s += "foo"` is *very* different from `i += 1` when you realize that it's not s or i you want to change, but the *object* it's referring to. For s, you can't, so you have to assign the variable to refer to a new object. For i, you assign the same object a new value, and therefore you don't need to change i's reference. (continued...) – AviD Sep 21 '10 at 18:52
  • (contd) Where I think we are miscommunicating, is close to what you are saying the confusion, but allow me to clarify: When we say "variable", are we talking about the symbol, or the object to which it refers? Well, truthfully, usually depends on the context, which could lead to some confusion. For my part, in this discussion, I was referring to the symbol. I don't think MS were sloppy here... However, there are some situations, such as the int above (and I was purposefully obtuse in my previous comment), where they are one and the same. But my point still stands. – AviD Sep 21 '10 at 18:55
  • 1
    @AviD: The difference that you want to see between `s += "foo"` and `i += 1` doesn’t exist: in both cases it’s *not* the object that you are mutating, it’s the variable (= the symbol). For integers, there is no way of seeing the difference since they are value types and thus no two symbols can refer to the *same* object anyway (thus we cannot observe a different behaviour). But conceptually, Microsoft makes it very clear that the basic value types (comprising `int`) are immutable. Note that this is different from, say, C++ where you *can* mutate basic types (and you *can* see that difference). – Konrad Rudolph Sep 22 '10 at 08:07
  • 1
    @Konrad - right, that's what I meant that with int they are one and the same, and thus int (and other basic valuetypes) must be immutable. With strings, it should be different, since they are references - but that's why they are considered immutable, even though the mechanics underneath are very different. Truthfully, I don't even remember what we disagreed about, since we seem to be in agreement (if from a different PoV), though it probably started from some misunderstanding of mine... Thanks. – AviD Sep 26 '10 at 08:19
6

Factors: construction, memory requirements, boxing.

Normally, the constructor restrictions for structs (no explicit parameterless constructors, no base construction) decide whether a struct can be used at all. E.g. if the parameterless constructor should not initialize members to default values, use an immutable object.

If you still have the choice between the two, decide on memory requirements. Small items should be stored in structs especially if you expect many instances.

That benefit is lost when the instances get boxed (e.g. captured for an anonymous function or stored in a non-generic container) - you even start to pay extra for the boxing.
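A small sketch of what boxing means in practice. `Celsius` and `BoxingDemo` are hypothetical names; `ArrayList` stands in for the non-generic container mentioned above.

```csharp
using System;
using System.Collections;

// Hypothetical small value type for illustration.
public struct Celsius
{
    private readonly double _degrees;

    public Celsius(double degrees)
    {
        _degrees = degrees;
    }

    public double Degrees { get { return _degrees; } }
}

public static class BoxingDemo
{
    // Each conversion to object allocates a separate heap object holding a copy.
    public static bool EachBoxIsSeparateObject()
    {
        Celsius t = new Celsius(21.5);
        object first = t;
        object second = t;
        return !object.ReferenceEquals(first, second);
    }

    // ArrayList.Add takes object, so the struct is boxed on the way in
    // and unboxed (copied out) by the cast on the way back.
    public static double RoundTripThroughArrayList()
    {
        ArrayList list = new ArrayList();
        list.Add(new Celsius(21.5));
        return ((Celsius)list[0]).Degrees;
    }
}
```

A generic `List<Celsius>` would store the values inline and avoid the per-element allocation.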


What is "small", what is "many"?

The overhead for an object is (IIRC) 8 bytes on a 32-bit system. Note that with a few hundred instances, this may already decide whether or not an inner loop runs fully in cache, or triggers garbage collections. If you expect tens of thousands of instances, this may be the difference between run vs. crawl.

From that POV, using structs is NOT a premature optimization.


So, as rules of thumb:

If most instances would get boxed, use immutable objects.
Otherwise, for small objects, use an immutable object only if struct construction would lead to an awkward interface and you expect not more than thousands of instances.

peterchen
3

I actually don't recommend using .NET structs for Value Object implementation. There are two reasons:

  • Structs don't support inheritance
  • ORMs don't handle mapping to structs well

Here I describe this topic in detail: Value Objects explained

Vladimir
2

In today's world (I'm thinking C# 3.0 / .NET 3.5) I do not see a need for structs (EDIT: Apart from in some niche scenarios).

The pro-struct arguments appear to be mostly based around perceived performance benefits. I would like to see some benchmarks (that replicate a real-world scenario) that illustrate this.

The notion of using a struct for "lightweight" data structures seems way too subjective for my liking. When does data cease to be lightweight? Also, when adding functionality to code that uses a struct, when would you decide to change that type to a class?

Personally, I cannot recall the last time I used a struct in C#.

Edit

I suggest that the use of a struct in C# for performance reasons is a clear case of Premature Optimization*

* unless the application has been performance profiled and the use of a class has been identified as a performance bottleneck

Edit 2

MSDN states:

The struct type is suitable for representing lightweight objects such as Point, Rectangle, and Color. Although it is possible to represent a point as a class, a struct is more efficient in some scenarios. For example, if you declare an array of 1000 Point objects, you will allocate additional memory for referencing each object. In this case, the struct is less expensive.

Unless you need reference type semantics, a class that is smaller than 16 bytes may be more efficiently handled by the system as a struct.

Richard Ev
  • 2
    Have you used an Int32 or DateTime lately? Those are pretty good reasons to have a struct :-) "Class vs struct" is the same concept as "entity vs value", expressed in language terms. The difference is around identity, *not* perceived performance benefits. – Bryan Watts Feb 22 '09 at 23:08
  • Never define your own struct, and you will live a longer and happier life. Always use classes (except for interop field layout). – Brian Feb 22 '09 at 23:21
  • Bryan - can you clarify which point above you are referring to in your comment? – Richard Ev Feb 22 '09 at 23:29
  • Bryan - my point was twofold: 1) Some of the respondents cited performance as a reason for using a struct 2) Others suggested that it should be used for lightweight constructs. I feel that both of these viewpoints are very subjective and would benefit from clarification. – Richard Ev Feb 22 '09 at 23:32
  • @Richard E: I was referring to @Brian's comment. I decided to remove it since it was so easily misinterpreted. @Brian: while that makes life less complicated, it also removes a powerful tool from your toolbox. – Bryan Watts Feb 22 '09 at 23:39
  • I agree with both points, that performance should not be a primary factor. Structs have their place when used correctly; they simply require diligence. My favorite use is for numbers with valid ranges outside those of the language type. For example, probability is 0-1, a percentage is 0-100, etc. – Bryan Watts Feb 22 '09 at 23:40
  • 2
    premature optimization is evil only if you compromise anything else for it - e.g. readability, interface elegance, development time. – peterchen Feb 23 '09 at 00:11
2

In general, I would not recommend structs for business objects. While you MIGHT gain a small amount of performance by heading this direction, as you are running on the stack, you end up limiting yourself in some ways and the default constructor can be a problem in some instances.

I would state this is even more imperative when you have software that is released to the public.

Structs are fine for simple types, which is why you see Microsoft using structs for most of the data types. In like manner, structs are fine for objects that make sense on the stack. The Point struct, mentioned in one of the answers, is a fine example.

How do I decide? I generally default to object and if it seems to be something that would benefit from being a struct, which as a rule would be a rather simple object that only contains simple types implemented as structs, then I will think it through and determine if it makes sense.

You mention an address as your example. Let's examine one, as a class.

public class Address
{
    public string AddressLine1 { get; set; }
    public string AddressLine2 { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string PostalCode { get; set; }
}

Think through this object as a struct. As you do, consider the types included inside this address "struct", if you coded it that way. Do you see anything that might not work out the way you want? And is there actually a performance benefit to be had?
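For comparison, here is a rough sketch of the same idea as an immutable class with value equality. `ImmutableAddress` is a hypothetical name, and the field list is shortened for brevity; the null checks are an assumption about what "valid" means for an address.

```csharp
using System;

// Hypothetical immutable counterpart of the Address class above (fewer fields for brevity).
public sealed class ImmutableAddress : IEquatable<ImmutableAddress>
{
    private readonly string _line1;
    private readonly string _city;
    private readonly string _postalCode;

    public ImmutableAddress(string line1, string city, string postalCode)
    {
        // Unlike a struct, a class can force every instance through this validation.
        if (line1 == null) throw new ArgumentNullException("line1");
        if (city == null) throw new ArgumentNullException("city");
        if (postalCode == null) throw new ArgumentNullException("postalCode");

        _line1 = line1;
        _city = city;
        _postalCode = postalCode;
    }

    public string Line1 { get { return _line1; } }
    public string City { get { return _city; } }
    public string PostalCode { get { return _postalCode; } }

    // Two addresses are equal when all their fields are equal.
    public bool Equals(ImmutableAddress other)
    {
        return !ReferenceEquals(other, null)
            && _line1 == other._line1
            && _city == other._city
            && _postalCode == other._postalCode;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ImmutableAddress);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + _line1.GetHashCode();
            hash = hash * 31 + _city.GetHashCode();
            hash = hash * 31 + _postalCode.GetHashCode();
            return hash;
        }
    }
}
```

Since every field is a string reference anyway, a struct version would copy the references on each pass-by-value while still pointing at the same heap strings, and its default value would have all fields null.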

Gregory A Beamer
  • Applying the thought experiment from my answer to your example: does an address make sense if all its fields are null? I would say no in this case. – Bryan Watts Feb 22 '09 at 23:46
  • Realistically, with data objects, you do not create an object until at least one item is not null, but I would agree that an object with 100% nulls is invalid. The main point I was making is a struct is not the best method to code an address. – Gregory A Beamer Feb 25 '09 at 16:30
0

What is the cost of copying instances if passed by value?

If high, then immutable reference (class) type, otherwise value (struct) type.

Richard
0

As a rule of thumb, a struct's size should not exceed 16 bytes; otherwise passing it between methods may become more expensive than passing object references, which are just 4 bytes long (on a 32-bit machine).
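One way to get a rough feel for the 16-byte guideline is to ask the runtime for sizes. This sketch uses `Marshal.SizeOf`, which reports the unmanaged (marshaled) size; for structs made only of primitive numeric fields with sequential layout, that is assumed here to be a reasonable stand-in for the managed size. `Point2D`, `Vector3D` and `SizeDemo` are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical structs for illustration.
[StructLayout(LayoutKind.Sequential)]
public struct Point2D
{
    public int X;
    public int Y;
}

[StructLayout(LayoutKind.Sequential)]
public struct Vector3D
{
    public double X;
    public double Y;
    public double Z;
}

public static class SizeDemo
{
    // Two 4-byte ints.
    public static int SizeOfPoint2D()
    {
        return Marshal.SizeOf(typeof(Point2D));
    }

    // Three 8-byte doubles: already above the 16-byte guideline.
    public static int SizeOfVector3D()
    {
        return Marshal.SizeOf(typeof(Vector3D));
    }

    // Size of an object reference: 4 in a 32-bit process, 8 in a 64-bit one.
    public static int ReferenceSize()
    {
        return IntPtr.Size;
    }
}
```

Whether a copy of that size actually matters depends on the call pattern, so measuring remains the safer guide.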

Another concern is a default constructor. A struct always has a default (parameterless and public) constructor, otherwise the statements like

T[] array = new T[10]; // array with 10 values

would not work.
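The difference shows up as soon as the array is created. In this sketch (`Slot`, `SlotClass` and `ArrayDemo` are hypothetical names), the struct array already contains ten usable zeroed values, while the class array contains ten null references.

```csharp
// Hypothetical types for illustration.
public struct Slot
{
    public int Value;
}

public class SlotClass
{
    public int Value;
}

public static class ArrayDemo
{
    // Every element of a struct array is a zero-initialized value.
    public static int FirstStructValue()
    {
        Slot[] slots = new Slot[10];
        return slots[0].Value;
    }

    // Every element of a class array starts out as a null reference.
    public static bool FirstClassElementIsNull()
    {
        SlotClass[] slots = new SlotClass[10];
        return slots[0] == null;
    }
}
```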

Additionally, it's courteous for structs to overload the == and != operators and to implement the IEquatable<T> interface.

Michael Damatov
0

From an object modeling perspective, I appreciate structs because they let me use the compiler to declare certain parameters and fields as non-nullable. Of course, without special constructor semantics (like in Spec#), this is only applicable to types that have a natural 'zero' value. (Hence Bryan Watts's 'thought experiment' answer.)

Av Pinzur
-2

Structs are strictly for advanced users (along with out and ref).

Yes, structs can give great performance when using ref, but you have to see what memory they are using, who controls the memory, etc.

If you're not using ref and out with structs, they are not worth it; if you are, expect some nasty bugs :-)

ben