
A while ago I read the excellent CLR via C# (4th Edition), which I strongly recommend to anybody doing C# development who wants to better understand the underlying mechanisms and the magic behind the scenes.

Since then I have started to worry excessively about whether I should use structs rather than classes in my designs.

I have a situation that looks like a perfect fit for structs: fast communication of temporary chunks of data.

However, I'm still hesitant to use structs in this scenario, since I've seen so many rules of thumb. For instance, an instance should be smaller than 16 bytes. There is also the fact that I want to carry an array or a string (which at least is immutable) as one of the struct's fields, while that array or string will still need to be garbage collected later on.

It seems that the legitimate uses of structs are very limited.

Below is an oversimplified C# 6 example. I suspect the answer regarding memory efficiency is: no, don't use a struct here, because you will end up with a lot of wasted memory, and passing it as an argument will be costly performance-wise.

public enum ChunkHeader : ushort
{
    Unknown = 0x00,
    Transmitted = 0x01,
    Received = 0x02,
}

public struct Chunk
{
    public static Chunk Empty = new Chunk(ChunkHeader.Unknown, new Byte[0]);

    public Chunk(ChunkHeader header, Byte[] data)
    {
        this.Header = header;
        this.Data = ExceptionHelpers.ThrowIfNull(data, nameof(data));
    }

    // C# 6 read-only >>auto<<-properties        
    public ChunkHeader Header { get; }
    public Byte[] Data { get; }
}

public static class ExceptionHelpers
{
    public static T ThrowIfNull<T>(T parameterValue, String parameterName)
    {
        if ((Object)parameterValue == null)
        {
            throw new ArgumentNullException(parameterName);
        }
        else
        {
            return parameterValue;
        }
    }
}

My question is fairly simple: can the struct above be memory efficient in terms of garbage collection? Or is it still better to go with a class, considering the reference-type field inside the struct?

If so, are there any workarounds or general guidelines for handling a large number of data chunks without triggering massive garbage collections?

Please note that these chunks of data are only meant to be used for a short time.

Natalie Perret
  • Nitpick, but you can't define a parameterless constructor for a struct in C#. The feature you're using here has not made it into the official language. – Theodoros Chatzigiannakis Sep 04 '15 at 13:56
  • What _exactly_ is your question? Is it something along the lines of _"What data structure to use for many short-lived strings without having to wait so often for garbage collection"_? – CodeCaster Sep 04 '15 at 13:59
  • @DStanley No, but read-only *auto*-properties are. Note how there's no setter and no explicit backing field. – Luaan Sep 04 '15 at 14:03
  • @Luaan thanks for your support, maybe D Stanley wanted me to be a bit more accurate, going to add that. – Natalie Perret Sep 04 '15 at 14:06
  • There is no one guideline. The best advice I can give you is to benchmark your code and optimize hot paths. Don't fall for micro-optimizations just because you think they will improve performance. Code, test, fix. – Yuval Itzchakov Sep 04 '15 at 14:12
  • @CodeCaster fixed now, I rephrased it, guess it should be ok now. Let me know if there is anything else that is not clear for you :) – Natalie Perret Sep 04 '15 at 14:12
  • I tend to avoid structs in C#: they have more cons than pros. – Mario Vernari Sep 04 '15 at 14:25
  • @Nitpick, my bad: I was coding in MonoDevelop with Mono, and it seems that "this" C# 6 supports it (or at least used to; I haven't updated that laptop for a while, so it might have been fixed in the meantime). – Natalie Perret Sep 04 '15 at 14:25
  • Yeah, it was included in some release candidate versions in Visual Studio too, but in the end the feature was retracted from the compilers. I suggest you update your toolchain, so you don't accidentally end up with annoying uncompilable code. – Theodoros Chatzigiannakis Sep 04 '15 at 14:29
  • @TheodorosChatzigiannakis agreed. I still remember that period too, when the features were still changing quite often, anyway thanks. – Natalie Perret Sep 04 '15 at 14:44
  • Have you considered pooling the chunk `Data` arrays? That's a straightforward way to avoid garbage collection if you've measured that to be a problem. (I think that whether the thing containing `Data` is a `struct` or `class` isn't important: it's containing a pointer to a byte array either way.) – 31eee384 Sep 04 '15 at 15:58
  • @31eee384: you're right, that's actually the bottom line in this specific situation; pooling might well be more suitable than scratching my head about the pertinence of value types. – Natalie Perret Sep 04 '15 at 16:13
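
The pooling idea from the comments could look roughly like the sketch below. It assumes a runtime that ships `System.Buffers.ArrayPool<T>` (which did not exist when this was asked); `ChunkBufferDemo` and `ProcessOnce` are hypothetical names made up for illustration, not part of the question's code.

```csharp
using System;
using System.Buffers;

// Hypothetical helper, for illustration only.
public static class ChunkBufferDemo
{
    // Rents a buffer from the shared pool, fills it with sample data,
    // sums the payload, and hands the buffer back to the pool.
    public static int ProcessOnce(int payloadLength)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(payloadLength);
        try
        {
            // Rent may return an array larger than requested, so the
            // logical length has to be tracked separately.
            for (int i = 0; i < payloadLength; i++)
            {
                buffer[i] = (byte)i;
            }

            int sum = 0;
            for (int i = 0; i < payloadLength; i++)
            {
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Returning the array lets later calls reuse it instead of
            // allocating a new one.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

Whether the type wrapping the rented array is a struct or a class matters little here; the byte array is what would otherwise create garbage. As advised above, measure first to confirm that allocation is really the bottleneck.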

1 Answer


Here is the article you want to read. It is called exactly "Choosing Between Class and Struct".

You know, I think in the example that you have given it makes sense to create the struct. Your Chunk seems to represent a single value, similar to primitive types (int, double, etc.).

As the article itself says, there are three important aspects to keep in mind:

  1. Allocation: A struct is allocated inline, which means on the stack for a local variable, or inside the containing object or array when it is a field or an element, while a class instance is always allocated on the heap;
  2. Boxing/Unboxing: If you make Chunk implement an interface and use it through that interface, the value will be boxed and unboxed, which has a negative impact on performance. This is not the case with reference types;
  3. Value vs Reference: A struct is passed by value while a class instance is passed by reference. Therefore a change made to a struct that was passed to a method will not be reflected in the other copies, unlike with a reference type.
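
A minimal sketch of points 2 and 3 follows. `PointStruct`, `PointClass`, and `CopySemanticsDemo` are hypothetical types invented for this illustration; they are not from the question or from the article.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct PointStruct
{
    public int X;
}

public class PointClass
{
    public int X;
}

public static class CopySemanticsDemo
{
    // Receives a copy of the struct, so the caller's value is untouched.
    public static void Bump(PointStruct p) { p.X++; }

    // Receives a copy of the reference, so the caller's object is modified.
    public static void Bump(PointClass p) { p.X++; }

    public static int StructAfterCall()
    {
        var p = new PointStruct { X = 1 };
        Bump(p);
        return p.X; // still 1
    }

    public static int ClassAfterCall()
    {
        var p = new PointClass { X = 1 };
        Bump(p);
        return p.X; // now 2
    }

    // Converting a struct to object (or to an interface it implements)
    // boxes it: the value is copied into a new heap object.
    public static bool BoxedCopyIsIndependent()
    {
        var p = new PointStruct { X = 1 };
        object boxed = p; // boxing: heap allocation plus copy
        p.X = 42;         // changes only the local, not the boxed copy
        return ((PointStruct)boxed).X == 1;
    }
}
```

Calling `StructAfterCall()` and `ClassAfterCall()` side by side shows the difference in copy semantics, and `BoxedCopyIsIndependent()` shows that boxing creates a separate heap copy.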

To summarize:

√ CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.

X AVOID defining a struct unless the type has all of the following characteristics: It logically represents a single value, similar to primitive types (int, double, etc.). It has an instance size under 16 bytes. It is immutable. It will not have to be boxed frequently.


Regarding two points about AVOIDING structs

16 bytes size guideline - (answer is taken from here): The 16 bytes guideline is just a performance rule of thumb. The point is that because value types are passed by value, the entire struct has to be copied when it is passed to a function, whereas for a reference type only the reference (4 bytes in a 32-bit process, 8 bytes in a 64-bit one) has to be copied. A struct might save a bit of time though, because you remove a layer of indirection, so even if it is larger than a reference, it might still be more efficient than passing a reference around. But at some point it becomes so big that the cost of copying becomes noticeable, and a common rule of thumb is that this typically happens around 16 bytes.
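
To see where a type like the question's Chunk stands relative to that guideline, the sizes can be inspected at run time. The sketch below assumes a runtime where `System.Runtime.CompilerServices.Unsafe.SizeOf<T>()` is available (built into modern .NET, a separate package on .NET Framework); `SampleHeader`, `SampleChunk`, and `StructSizeDemo` are hypothetical stand-ins, and the numbers depend on the platform and on padding.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-ins for the question's ChunkHeader and Chunk.
public enum SampleHeader : ushort
{
    Unknown = 0
}

public struct SampleChunk
{
    public SampleHeader Header; // 2 bytes before padding
    public byte[] Data;         // a reference: 4 or 8 bytes
}

public static class StructSizeDemo
{
    // Size of the struct itself, including any padding the runtime adds.
    // The byte array it points to is not included.
    public static int SizeOfSampleChunk() => Unsafe.SizeOf<SampleChunk>();

    // Size of a single reference on the current platform.
    public static int SizeOfReference() => IntPtr.Size;
}
```

Only the reference counts toward the struct's size; the array contents live on the heap regardless of whether the wrapper is a struct or a class.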

Immutability: The aspect of immutability means that whatever value you put in the struct, you set it only once and it is final. For example, if you give the chunk a name, then the name will be permanent for its lifetime. If the chunk is going to have many names throughout its life, you had better create a class. Apply the same logic to your Data to decide between struct and class.
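
One caveat when applying this to Data: a get-only property makes the reference final, not the array behind it. The sketch below illustrates this with a hypothetical `Packet` struct, a simplified stand-in for the question's Chunk.

```csharp
using System;

// Hypothetical, simplified stand-in for the question's Chunk.
public struct Packet
{
    public Packet(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException(nameof(data));
        }
        Data = data;
    }

    // The reference cannot be reassigned after construction,
    // but the array elements can still be written to.
    public byte[] Data { get; }
}

public static class ShallowImmutabilityDemo
{
    public static byte FirstByteAfterMutation()
    {
        var original = new Packet(new byte[] { 1, 2, 3 });
        Packet copy = original;  // copies the reference, not the array
        copy.Data[0] = 99;       // mutates the shared array
        return original.Data[0]; // 99: visible through the original too
    }
}
```

A string field does not have this problem, since strings are themselves immutable. For a byte array, the struct is only immutable in the shallow sense unless callers agree never to write to the array.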

In all other cases, you should define your types as classes.

displayName
  • Thanks displayName for your answer, I've already been through the MSDN page ;-) Overall: 1 - Allocation: the stack of the thread, that's perfectly fine. 2 - Boxing/Unboxing: no interface so far, and I was aware of the boxing behaviour; the book CLR via C# explains it pretty well with tons of examples. 3 - Value vs Reference: in my situation it would be fine too. I'm just a bit skeptical about the 16 bytes thing. Let's consider for a moment that my struct embeds a string instead of a byte array: doesn't that make it fully immutable? A packet can also be thought of as a value, can't it? – Natalie Perret Sep 04 '15 at 14:40
  • @EhouarnPerret: It is still fully immutable. Even when a struct with an embedded string is passed to, say, another method, any change made to the string there will not persist after the method returns. – displayName Sep 04 '15 at 15:11
  • @EhouarnPerret: The aspect of *immutability* means that whatever value you are putting in the struct, you will set them only once and they will be final. Like for example if you give the chunk a name, then the name will be permanent for its lifetime. If the chunk is going to have many names throughout its life, you better create a class. Apply same logic to your *Data* to decide struct vs class. – displayName Sep 04 '15 at 15:23
  • Thanks for your clarification, it makes much more sense now; it sounds like the 16 bytes thing is more of an empirical rule. BTW, isn't it supposed to be 16 bytes, and not 16 MB (megabytes), unless I'm misunderstanding your notation? – Natalie Perret Sep 04 '15 at 15:48
  • @EhouarnPerret: corrected. sorry about that. – displayName Sep 04 '15 at 15:51
  • If your goal is "carrying" strings, arrays or similar, I don't see any valid reason for using a struct rather than a class: the referred object (i.e. the string) is placed on the heap in any case. What differs is where the bunch of "pointers" are placed (and treated). Structs often lead to subtle errors when you try to alter them through properties or methods. – Mario Vernari Sep 04 '15 at 15:54