
Is there a way to deserialize a bytes field to a Stream member, without protobuf-net allocating a new (and potentially big) byte[] upfront?

I'm looking for something like this:

[ProtoContract]
public class Message
{
    [ProtoMember(1)]
    public Stream Payload { get; set; }
}

Here the Stream could be backed by a pre-allocated buffer pool, e.g. Microsoft.IO.RecyclableMemoryStream. Even after dropping down to ProtoReader for deserialization, all I see is AppendBytes, which always allocates a buffer the full length of the field. One has to drop even further, to DirectReadBytes, which only operates directly on the message stream, and I'd like to avoid that.

As background, I'm using protobuf-net to serialize/deserialize messages across the wire. This is a middle-layer component for passing messages between clients, so the messages are really an envelope for an enclosed binary payload:

message Envelope {
  required string messageId = 1;
  map<string, string> headers = 2;
  bytes payload = 3;
}

The size of payload is capped at ~2 MB, which is still large enough for the byte[] to land in the large object heap (LOH).

Using a surrogate, as in "Protobuf-net: Serializing a 3rd party class with a Stream data member", doesn't work because it simply wraps the same monolithic array.

One technique that should work is mentioned in "Memory usage serializing chunked byte arrays with Protobuf-net": change bytes to repeated bytes and rely on the sender to limit the size of each chunk. This solution may be good enough. It would prevent LOH allocation, but it wouldn't allow buffer pooling.
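For illustration, the chunked approach could be sketched roughly as below. This is only a sketch under assumptions: `ChunkedEnvelope`, `PayloadChunker`, and the 64 KB chunk size are hypothetical names and values that do not come from either linked question, the headers field is omitted for brevity, and it assumes protobuf-net maps a `List<byte[]>` member to `repeated bytes`. The chunk size is chosen to stay below the 85,000-byte LOH threshold.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // requires the protobuf-net package

// Hypothetical envelope: the payload is carried as "repeated bytes"
// instead of a single "bytes" field.
[ProtoContract]
public class ChunkedEnvelope
{
    [ProtoMember(1)]
    public string MessageId { get; set; }

    // Each element is expected to stay below the LOH threshold.
    [ProtoMember(3)]
    public List<byte[]> Payload { get; set; } = new List<byte[]>();
}

// Hypothetical sender-side helper that splits a stream into small chunks.
public static class PayloadChunker
{
    // 64 KB is an assumed value, below the 85,000-byte LOH threshold.
    public const int DefaultChunkSize = 64 * 1024;

    public static List<byte[]> Split(Stream source, int chunkSize = DefaultChunkSize)
    {
        var chunks = new List<byte[]>();
        var buffer = new byte[chunkSize];
        int filled;
        while ((filled = Fill(source, buffer)) > 0)
        {
            // Copy exactly the bytes read, so the last chunk is not padded.
            var chunk = new byte[filled];
            Buffer.BlockCopy(buffer, 0, chunk, 0, filled);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Stream.Read may return fewer bytes than requested, so loop until
    // the buffer is full or the stream ends.
    private static int Fill(Stream source, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = source.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Every chunk here is still a freshly allocated byte[], on the sending side and presumably on the receiving side after deserialization, which is why this avoids the LOH but does not give pooling.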

browe
  • Not in c#. Serialization requires entire data to be in an object for standard serialization to work. You could write a custom serializer, or write code in c++. – jdweng Jun 17 '16 at 23:50
  • C# itself shouldn't be an issue, I can definitely do it with custom serialization by reading a chunk from the message stream and writing to the payload-field stream. On the other hand, I was hoping to leverage the excellent protobuf-net library as much as possible. – browe Jun 22 '16 at 16:02
  • Why can't you wait for the entire buffer to be received before serialization? – jdweng Jun 22 '16 at 16:16
  • I'm fine with receiving the entire message, as it is buffered in a stream that divides it amongst small byte[], what I wanted to avoid was protobuf-net's allocation of a monolithic byte[] for one of the blob fields in the message. – browe Jun 28 '16 at 17:37

1 Answer


The question here is about the payload field. No, there is not currently a mechanism to handle that, but it is certainly something that could be investigated for options. It could be that we can do something like an ArraySegment<byte> AllocateBuffer(int size) callback on the serialization-context that the caller could use to take control of the allocations (the nice thing about this is that protobuf-net doesn't actually work with ArraySegment<byte>, so it would be a purely incremental change that wouldn't impact any existing working code; if no callback is provided, we would presumably just allocate a flat buffer as it does currently). I'm open to other suggestions, but right now: no, it will allocate a byte[] internally.
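To make the proposed callback concrete, a caller-side allocator might look something like the sketch below. Everything in it is an assumption for illustration: protobuf-net exposes no such hook, so the `AllocateBuffer` delegate and the `PooledBufferAllocator` class are hypothetical, and `ArrayPool<byte>` from `System.Buffers` merely stands in for whatever pool the caller prefers.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;

// Hypothetical callback shape taken from the proposal above.
// This is NOT an existing protobuf-net API.
public delegate ArraySegment<byte> AllocateBuffer(int size);

// Hypothetical caller-side allocator that hands out pooled buffers and
// returns them all to the pool when disposed.
public sealed class PooledBufferAllocator : IDisposable
{
    private readonly ArrayPool<byte> _pool;
    private readonly List<byte[]> _rented = new List<byte[]>();

    public PooledBufferAllocator(ArrayPool<byte> pool = null)
    {
        _pool = pool ?? ArrayPool<byte>.Shared;
    }

    // Matches the AllocateBuffer delegate signature.
    public ArraySegment<byte> Allocate(int size)
    {
        // Rent may return an array larger than requested, so the segment
        // records the exact length that was asked for.
        byte[] buffer = _pool.Rent(size);
        _rented.Add(buffer);
        return new ArraySegment<byte>(buffer, 0, size);
    }

    public void Dispose()
    {
        foreach (byte[] buffer in _rented)
        {
            _pool.Return(buffer);
        }
        _rented.Clear();
    }
}
```

The ArraySegment<byte> return type matters because a pooled array is usually longer than the field: the segment carries the real payload length, and the owner of the allocator decides when the buffers go back to the pool.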

Marc Gravell
  • I think your proposal would work Marc. The ArraySegments would be handed out from a LOH-allocated pool and returned when the wrapping type is disposed. – browe Jul 15 '16 at 15:25
  • wondering if this is supported yet? I too have similar question on this. – Rohit Sharma Nov 10 '19 at 13:45
  • @Rohit it isn't something I've looked into, and Stream is especially problematic (for deserialization, knowing which to choose, rewinding, etc); however, with the 3.0 bits, it would probably be quite easy to support `Memory` or `ReadOnlySequence` - in fact I have some examples of that using custom allocators, for example the "arenas" allocator from another of my libraries. Any use? – Marc Gravell Nov 10 '19 at 21:46
  • Yes Memory would be lovely – Rohit Sharma Nov 11 '19 at 12:20