I'm trying to allocate memory for hundreds of thousands of objects up front and initialize them later from an array of bytes. My goal is to avoid a separate memory allocation for each object, which is why I am using C# structs.

Union:

[StructLayout(LayoutKind.Explicit)]
struct HeaderUnion
{
    [FieldOffset(0)]
    public unsafe fixed char Data[8];

    [FieldOffset(0)]
    public HeaderSeq HeaderSeq;
}

HeaderSeq:

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct HeaderSeq
{
    [MarshalAs(UnmanagedType.ByValTStr,SizeConst = 2)]
    public string FirstName;

    [MarshalAs(UnmanagedType.ByValTStr,SizeConst = 2)]
    public string LastName;
}

In the program I want to write:

var bytes = File.ReadAllBytes("image.dat");
var headerUnions = new HeaderUnion[100000];
var size = Marshal.SizeOf<HeaderSeq>();

for (int i = 0; i < 100000; i++)
{
    var positionIndex = i * size;

    unsafe
    {
        fixed (char* charPtr = headerUnions[i].Data)
        {
            Marshal.Copy(bytes, positionIndex, (IntPtr)charPtr, size);
        }
    }
}

However, it fails at runtime with this error:

Unhandled exception. System.TypeLoadException: Could not load type 'HeaderUnion' from assembly 'MyProject.Console, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' because it contains an object field at offset 0 that is incorrectly aligned or overlapped by a non-object field.

If I redefine HeaderSeq to only contain single-char fields instead of string or char[], it works fine.

If I do not include HeaderSeq in HeaderUnion as a field, it also works fine, but then I have to rewrite my code to use Marshal.PtrToStructure:

// It does not suit me, because it allocates a struct on each cycle.
// Instead, I want all the memory to be pre-allocated already in the fixed buffers.
var headerSeq = Marshal.PtrToStructure<HeaderSeq>((IntPtr) charPtr);

I could preallocate arrays of char on the managed heap and store only indexes in HeaderSeq, but that approach seems less elegant to me.

Is my goal achievable at all in C#? How should I define my structs?

Victor Ponamarev
  • *"My goal is to skip memory allocation on each object. That is why I am using C# structs."* then you are going the wrong way about it. What you will end up with is a huge amount of copying between the managed form of the struct (which contains an array) and the unmanaged which is a fixed size. It sounds like instead you should just have an array of normal structs – Charlieface Apr 11 '22 at 00:43
  • @Charlieface, I'm not sure I follow you without a concrete code sample. With _an array of normal structs_ I would have to allocate strings on every iteration while initializing the objects, because the array or string fields of the items in a newly created array of structs are `null`. That is not an acceptable solution in my scenario. In my proposal, I only copy bytes or chars from managed to unmanaged memory, and then I reuse those bytes with the help of the union. What's wrong with this approach? – Victor Ponamarev Apr 11 '22 at 06:49
  • Because in managed memory it's still an array or string, so the same allocation is going to happen, and then you need to allocate the unmanaged memory and copy it. Why don't you just make the struct into four `char` values? Alternatively, use one giant `char` array from which you can take `Span` values out of – Charlieface Apr 11 '22 at 09:26
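The "one giant `char` array" idea from the last comment can be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: the data is UTF-16 little-endian, each record is two names of two chars each, and the helper names `FirstName` and `LastName` are made up for the example. The in-memory byte array stands in for `File.ReadAllBytes("image.dat")`.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class Program
{
    const int NameLength = 2;                 // chars per name, as in the question
    const int RecordLength = 2 * NameLength;  // chars per record (first + last name)

    // Hypothetical helpers: each returns a view into the shared buffer, not a copy.
    internal static ReadOnlySpan<char> FirstName(ReadOnlySpan<char> all, int index) =>
        all.Slice(index * RecordLength, NameLength);

    internal static ReadOnlySpan<char> LastName(ReadOnlySpan<char> all, int index) =>
        all.Slice(index * RecordLength + NameLength, NameLength);

    static void Main()
    {
        // Stand-in for the file contents: two records of 8 bytes each.
        byte[] bytes = Encoding.Unicode.GetBytes("JoDoAnLe");

        // View the same memory as chars; nothing is copied or allocated here.
        ReadOnlySpan<byte> raw = bytes;
        ReadOnlySpan<char> chars = MemoryMarshal.Cast<byte, char>(raw);

        // Comparing spans does not allocate.
        Console.WriteLine(FirstName(chars, 1).SequenceEqual("An".AsSpan()));

        // ToString() allocates a string, so call it only when one is really needed.
        Console.WriteLine(LastName(chars, 0).ToString());
    }
}
```

With this layout the only allocation is the byte array itself; a "record" is just an index into it.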

1 Answer

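There is no way to do this with managed reference types: in an explicit-layout struct, the runtime refuses to overlap an object field (string, and for the same reason char[]) with a non-object field, which is exactly what the TypeLoadException says. You're going to have to use fields that are not object references instead, such as a fixed char buffer, individual char fields, or a char*.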
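As a minimal sketch of a layout without object fields, the struct below replaces the two strings with four `char` fields, which the question reports as working. The type name `Header` and its field names are hypothetical, the data is assumed to be UTF-16 little-endian, and the in-memory byte array stands in for the file. Because the struct contains no references, `MemoryMarshal.Cast` can reinterpret the bytes in place, so no union and no `Marshal.PtrToStructure` call are needed.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical record type: four UTF-16 chars at fixed offsets, 8 bytes in total.
[StructLayout(LayoutKind.Explicit)]
struct Header
{
    [FieldOffset(0)] public char First0;
    [FieldOffset(2)] public char First1;
    [FieldOffset(4)] public char Last0;
    [FieldOffset(6)] public char Last1;
}

static class Program
{
    static void Main()
    {
        // Stand-in for File.ReadAllBytes("image.dat"): two 8-byte records.
        byte[] bytes = Encoding.Unicode.GetBytes("JoDoAnLe");

        // Reinterpret the bytes as records; no copy and no per-record allocation.
        Span<Header> headers = MemoryMarshal.Cast<byte, Header>(bytes.AsSpan());

        Console.WriteLine(headers.Length);     // 2
        Console.WriteLine(headers[1].First0);  // A (on a little-endian machine)
    }
}
```

The span is a view over the original byte array, so writing through `headers[i]` also changes `bytes`.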

On an unrelated note, the Data field takes up 16 bytes rather than 8 (according to sizeof on .NET 6 x86). Changing it to a fixed byte buffer (fixed byte Data[8]) brings it down to 8 bytes, so you would allocate less memory.

EDIT: After looking into the second point a little more: a C# char is a 16-bit UTF-16 code unit, so fixed char Data[8] really does occupy 16 bytes. What confused me is that a char is marshalled as a single 8-bit character by default. If you plan on working with Unicode text, char is the right element type.
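The difference between the managed size of a `char` and its marshalled size can be checked with a small program. The wrapper structs `AnsiChar` and `UnicodeChar` are hypothetical names used only for this illustration; the expected values in the comments assume the default runtime marshalling behaviour.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper structs that differ only in the CharSet used for marshalling.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
struct AnsiChar { public char Value; }

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct UnicodeChar { public char Value; }

static class Program
{
    static void Main()
    {
        // Managed size: a char is always one UTF-16 code unit.
        Console.WriteLine(sizeof(char));                   // 2

        // Marshalled size depends on the CharSet of the containing struct.
        Console.WriteLine(Marshal.SizeOf<AnsiChar>());     // 1
        Console.WriteLine(Marshal.SizeOf<UnicodeChar>());  // 2
    }
}
```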

POBIX
  • It works with `string` if I do not use the fixed-size buffer union. The `Data` field is of type `char`, not bytes. Even if I set its size to 128, it still does not work. It starts working if I comment out the HeaderSeq field and then use `Marshal.PtrToStructure` as described in the question. – Victor Ponamarev Apr 11 '22 at 06:37