I have a binary serialized object in memory and I want to read it by using pointers (unsafe code) in C#. Please look at the following function, which reads from memory.

static Results ReadUsingPointers(byte[] data)
{
    unsafe
    {
        fixed (byte* packet = &data[0])
        {
            return *(Results*)packet;
        }
    }
}

At the statement return *(Results*)packet; I get the compile-time error "Cannot take the address of, get the size of, or declare a pointer to a managed type Results"

Here is my structure

public struct Results
{
    public int Id;
    public int Score;
    public char[] Product;
}

As per my understanding, all properties of my struct are blittable properties, so why am I getting this error, and what should I do if I need to use char[] in my structure?

EDIT-1 Let me explain further (please note that the objects are mocked)...

Background: I have an array of Results objects, I serialized them using binary serialization. Now, at later stages of my program, I need to de-serialize my data in memory as quickly as possible, as the data volume is very large. So I was trying to see how unsafe code could help me there.

Let's say my structure doesn't include public char[] Product; then I get my data back at a reasonably good speed. But with char[] it gives me the error (as the compiler should). I am looking for a solution that works with char[] in this context.

ak1
  • Can you tell us a bit more about the data you are trying to read from? Otherwise it's hard to tell how to properly process the array data. – floele Apr 21 '14 at 08:37

2 Answers

MSDN says:

Any of the following types may be a pointer type:

  • sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, or bool.

  • Any enum type.

  • Any pointer type.

  • Any user-defined struct type that contains fields of unmanaged types only.

So you could define your struct as follows to fix the compiler error:

public struct Results
{
    public int Id;
    public int Score;
    // Don't actually do this though.
    public unsafe char* Product;
}

This way, you can point to the first element of an array.

However, based on your edited question, you need a different approach here.

I have an array of Results objects, I serialized them using binary serialization. Now, at later stages of my program, I need to de-serialize my data in memory as quickly as possible

Usually you would use BinaryFormatter for that purpose. If that is too slow, the question should rather be whether serialization can be avoided in the first place.

floele
  • The first part of your answer is accurate. But the second part surely contains bad advice. No serialization I have ever come across operates by writing pointers to byte streams. – David Heffernan Apr 21 '14 at 08:20
  • @DavidHeffernan Indeed, serializing that would be quite unusual. This is just a solution for the compiler error. There are certainly other ways to serialize properly, but we might need to know a bit more about the data structure that is being read here. – floele Apr 21 '14 at 08:26
  • My problem with your answer is that it appears to bless using `unsafe char*` when that must be the wrong solution to the problem. I don't like that your answer does that. – David Heffernan Apr 21 '14 at 08:29
  • @DavidHeffernan I adjusted my answer a bit, maybe we'll get some more details from the author to suggest a proper solution. – floele Apr 21 '14 at 08:36
  • I still don't like the suggestion to use `unsafe char*`. A naive reader will think that is a good idea when for sure it is not. Why are you retaining it? – David Heffernan Apr 21 '14 at 08:37
  • @floele, with `public unsafe char* Product;` it seems that I get a pointer to the first character and can find the rest by the array length. But in general the lengths are unknown, so how could I find the rest of the chars? – ak1 Apr 21 '14 at 09:30
  • @floele You see, that comment from asif illustrates my point. You've given asif the false hope that `unsafe char*` is a viable solution. Now we'll just waste time down this dead end. – David Heffernan Apr 21 '14 at 09:31
  • @asif I edited my answer a bit. The question that comes to my mind now is whether or not you can possibly avoid serialization or create a situation where only parts of the data need to be deserialized. Is there an opportunity to optimise your process before it comes down to optimising binary serialization? – floele Apr 21 '14 at 09:49
  • @floele In my case binary serialization is part of my requirements. Serialization itself is not that crucial, but de-serialization is: because of the nature of my problem I need it to be as quick as possible. – ak1 Apr 21 '14 at 10:38
  • @asif: So what about simply using `BinaryFormatter`? How does it perform? Can you choose the method of serialization? – floele Apr 21 '14 at 11:24
  • I have already tried that, but it does not meet my requirements. I think that protobuf-net is the most efficient, but even that doesn't meet my requirements, as performance is critical in my case. – ak1 Apr 21 '14 at 12:08
  • @asif: "but even that doesn't meet my requirements" - well, if methods that certain people thought long and hard about optimising are not good enough for you, I doubt you'll be able to come up with a better solution unless you look for optimisations other than the serialization code. Anyway, if you follow the pointer approach, you need to additionally store the array length (as pointed out by David) in order to deserialize properly, because otherwise you have no way of knowing how much memory to read. – floele Apr 21 '14 at 12:22

You cannot expect that to work.

public struct Results
{
    public int Id;
    public int Score;
    public char[] Product;
}

The field Product has type char[], which is a managed type. Your code attempts to use the type Results*, which is a pointer type. The documentation states that you can declare pointers to any of the following:

  • sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, or bool.
  • Any enum type.
  • Any pointer type.
  • Any user-defined struct type that contains fields of unmanaged types only.

Now, your struct clearly matches none of the first three bullets, and it does not match the final bullet either because the array is a managed type.
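To illustrate the final bullet, here is a minimal sketch of a struct whose fields are all unmanaged, so that a pointer to it is legal. It is not a fix for the struct in the question: the names FixedResults, ProductCapacity and FixedResultsReader are hypothetical, the capacity of 16 characters is an arbitrary assumption, and a fixed-size buffer only applies when the maximum length is known at compile time. The code must be compiled with unsafe code enabled (for example with the /unsafe compiler option).

```csharp
using System;

// Hypothetical fixed-layout variant of Results. Every field is an unmanaged
// type, so FixedResults* is a legal pointer type.
public unsafe struct FixedResults
{
    // Assumed upper bound on the product name length, chosen for illustration.
    public const int ProductCapacity = 16;

    public int Id;
    public int Score;
    // Fixed-size buffer: the 16 chars are stored inline in the struct.
    public fixed char Product[ProductCapacity];
}

public static class FixedResultsReader
{
    // Reinterprets the start of the byte array as one FixedResults value.
    public static unsafe FixedResults Read(byte[] data)
    {
        if (data == null || data.Length < sizeof(FixedResults))
            throw new ArgumentException("Buffer is too small for one FixedResults.", nameof(data));

        fixed (byte* packet = data)
        {
            return *(FixedResults*)packet;
        }
    }

    // Copies the inline buffer into a string. Unused slots are assumed
    // to be padded with '\0' by whoever wrote the bytes.
    public static unsafe string GetProduct(FixedResults r)
    {
        return new string(r.Product, 0, FixedResults.ProductCapacity).TrimEnd('\0');
    }
}
```

The cost of this layout is that every record occupies the full capacity (40 bytes here) regardless of how short the product name is, and names longer than the capacity cannot be represented at all.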

As per my understanding, all properties of my struct are blittable properties.

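That is not the deciding factor, and a struct with an array field is not blittable in any case. To declare a pointer to the struct, every one of its fields must be an unmanaged type, and char[] is not.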


Even if your code did compile, how do you imagine that it could work? Consider this expression:

*(Results*)packet

How could the compiler turn that into something that would create a new array and copy the correct number of elements of the array? So clearly the compiler has no hope of doing anything useful here and that of course is why the language rejects your code.

I don't think that unsafe code is going to help you here. When you serialize your array you will have to serialize the length, and then the array's content. To deserialize you need to read the length, create a new array of that length, and then read the content. Unsafe code cannot help with that. A simple memory copy of a statically defined type is no use because that would imply that the array's length was known at compile time. It is not.
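The length-then-content scheme described above can be sketched with BinaryWriter and BinaryReader. This is only an illustration under assumptions that are not in the question: the record layout (Id, Score, Product length, then the characters as UTF-16), the leading record count, and the class name ResultsCodec are all made up here, and Product is assumed to be non-null. It is not the format produced by BinaryFormatter or any other existing serializer.

```csharp
using System;
using System.IO;
using System.Text;

public struct Results
{
    public int Id;
    public int Score;
    public char[] Product;
}

// Hypothetical codec. Assumed layout: record count (int32), then for each
// record Id (int32), Score (int32), Product length (int32), Product chars.
public static class ResultsCodec
{
    public static byte[] Write(Results[] items)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.Unicode))
        {
            writer.Write(items.Length);
            foreach (var item in items)
            {
                writer.Write(item.Id);
                writer.Write(item.Score);
                // The length goes first so the reader knows how much to read.
                writer.Write(item.Product.Length);
                // Write(char[]) emits the characters only, with no length prefix.
                writer.Write(item.Product);
            }
            writer.Flush();
            return stream.ToArray();
        }
    }

    public static Results[] Read(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new BinaryReader(stream, Encoding.Unicode))
        {
            var items = new Results[reader.ReadInt32()];
            for (int i = 0; i < items.Length; i++)
            {
                items[i].Id = reader.ReadInt32();
                items[i].Score = reader.ReadInt32();
                // Read the length, then allocate and fill an array of that size.
                int length = reader.ReadInt32();
                items[i].Product = reader.ReadChars(length);
            }
            return items;
        }
    }
}
```

Note that the reader has to allocate a new array per record, which is exactly the step that a plain pointer cast cannot perform.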


Regarding your update, you said:

I have an array of Results objects which I serialized using binary serialization.

In order to deserialize you need code that understands the detailed layout of your binary serialization. The code in the question cannot do it.

What you perhaps have not understood yet is that you cannot expect to copy arbitrary blocks of memory, whose lengths are variable and only known at runtime, without something actually knowing those lengths. In effect you are hoping to be able to copy memory without anything in the system knowing how much to copy.

Your attempts to deserialize using an unsafe typecast and memory copy cannot work. You cannot expect any more detailed help without consideration of the binary format of your serialization.

David Heffernan
  • I agree with you, but say I have a defined length, e.g. `public char[256] Product;`. Is there then any chance I could get the values by using pointers or by any other means? – ak1 Apr 21 '14 at 09:36
  • I tried many other ways, but didn't get the desired performance gain. That's why I am looking for a solution that could do it more quickly; I guess using pointers is my only hope. – ak1 Apr 21 '14 at 09:41
  • I think I've answered the question that you asked. – David Heffernan Apr 21 '14 at 09:42
  • Would you please share any alternative you have in mind that could perform well? Thanks. – ak1 Apr 22 '14 at 07:55
  • I don't know what your performance requirements are. And serialization is not my specialist subject at all. You should perhaps ask a different question. What I tried to do here was answer this question. – David Heffernan Apr 22 '14 at 07:58