
Heyup. Long time lover of protobuf.net.

Quick question though. I have a highly multithreaded C# application which is deserialising perhaps 100 objects per second, amounting to about 50MB/sec. I am seeing very large memory usage, well over and above what I am deserialising. I've run the application through the Red Gate ANTS Memory Profiler and it shows a massive number of Generation 2 objects due to protobuf (over 50% of application usage). Most of these objects are int values and are linked with:

- TypeModel.TryDeserializeList()
- ProtoBuf.Meta.BasicList

Any help reducing this gen 2 memory usage would be appreciated.

Marc

MarcF
  • Interesting - can you give an indication of roughly what the model looks like? Is it an array of ints? A list-of-int? Also: what framework is this? Full .NET? CF? SL? Can probably address this easily, but need more context to be sure. – Marc Gravell Dec 09 '11 at 19:15
  • Also - is the **root** object the list? Basically... Can haz codez? (or at least, something similar-ish) – Marc Gravell Dec 09 '11 at 19:29
  • ok. Apologies for the lack of details. What do you mean by "what the model looks like"? The largest object I'm deserialising is an int[33554432] array, using .NET 4.0. – MarcF Dec 11 '11 at 00:20
  • Unfortunately there is no specific code fragment I can offer which I know to be the cause of the problem. All the memory profiler seems to be telling me is that there are a very large number of int values in both Gen 2 and the Large Object Heap, somehow associated with ProtoBuf.Meta.BasicList. It's also worth adding that when all the deserialisation is complete and I call a manual garbage collection, the memory usage of the application drops to 20% of what it was while deserialising. Is this just the expected memory usage of protobuf when deserialising such a large int array? – MarcF Dec 11 '11 at 00:21
  • The main thing I want to know is... What is the T that is used in `Deserialize` - is it `Deserialize` ? – Marc Gravell Dec 11 '11 at 09:15

2 Answers


It sounds to me that the root T here is the array itself, i.e.

int[] values = Serializer.Deserialize<int[]>(source);

If that is the case, then currently it uses a slightly sub-optimal path for that scenario (the reason: it keeps a single code-path that also works on platforms with weak meta-programming/reflection models, such as iOS). I will try to spend a few hours tidying that at some point, but in answer to your question: you should be able to avoid the issue here simply by adding a parent object:

[ProtoContract]
public class MyDataWrapper { // need a new name...
    [ProtoMember(1)]
    public int[] Values { get; set; }
}

and then:

int[] values = Serializer.Deserialize<MyDataWrapper>(source).Values;

This is actually fully compatible with data already serialized via Serialize<int[]>, as long as the field-number used is 1. One additional benefit of this approach is that if desired you could use the "packed" sub-format (only available for lists/arrays of primitives such as int); although maybe that still isn't a great idea in this case due to the large length (it may require buffering when serialising).
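As a sketch of that "packed" option: in protobuf-net it is opted into via the `IsPacked` property on `[ProtoMember]`, and is only valid for lists/arrays of primitives. This mirrors the `MyDataWrapper` shape shown above (note that packing changes the wire format, so serialiser and deserialiser must agree on it):

```csharp
using ProtoBuf;

[ProtoContract]
public class MyDataWrapper
{
    // IsPacked = true writes the array as a single length-prefixed
    // block rather than one tag per element, which is usually more
    // compact for large primitive arrays.
    [ProtoMember(1, IsPacked = true)]
    public int[] Values { get; set; }
}
```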


Additional context: "v1" here basically uses MakeGenericType to switch to something like the above on the fly; however, since that approach is not available on many of the additional platforms that "v2" targets, it uses a less elegant approach there. But now that v2 is pretty stable, I could re-add the optimised version when running on full .NET 2.0 or above.
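As a rough illustration of the kind of meta-programming involved (this is not protobuf-net's actual internals), `MakeGenericType` lets code close an open generic type over a runtime-chosen argument, which is how a wrapper-of-T could be constructed on the fly for a root array:

```csharp
using System;
using System.Collections.Generic;

class MakeGenericTypeDemo
{
    static void Main()
    {
        // Close the open generic List<> over int at runtime,
        // analogous to building a wrapper type for a root array.
        Type open = typeof(List<>);
        Type closed = open.MakeGenericType(typeof(int));

        Console.WriteLine(closed == typeof(List<int>)); // True

        // The constructed type can then be instantiated reflectively.
        var list = (List<int>)Activator.CreateInstance(closed);
        list.Add(42);
        Console.WriteLine(list[0]); // 42
    }
}
```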

Marc Gravell
  • Yeap, that was the problem. Adding the wrapper and nothing else has completely removed the large memory usage. Many thanks again Marc. – MarcF Dec 11 '11 at 12:24
  • @MarcF k; I'll see if I can re-add the optimisation, for future usage – Marc Gravell Dec 11 '11 at 12:54
  • @MarcGravell I know this is an old question, but I ran into the same GC issue. Could you elaborate a bit on the work-around (using a wrapper class): will this work for all array types (i.e. ComplexType[] as the root object, where ComplexType may be modelled using inheritance)? Are the wrapped and the array solutions completely equivalent (i.e. does protobuf simply generate a wrapper class when serializing arrays)? And finally, have you looked into fixing the issue in v2.1.0 (pre-release)? – Mads Ravn Dec 18 '15 at 09:43
  • yes, it will then use a different code path (although the data will be identical); no, it hasn't been overhauled; I can't remember off the top of my head whether this will significantly impact GC, but it is worth a try. For the outermost type: no, it doesn't simply wrap it internally – Marc Gravell Dec 18 '15 at 10:54
  • @MarcGravell Thanks for the quick reply. I added some benchmark results in the answer below. The effect is quite pronounced so it might make sense to special case the code for the appropriate platforms. – Mads Ravn Dec 18 '15 at 13:02

To elaborate on Marc's answer, I did a quick benchmark of

  • Serialization/deserialization using a wrapper vs using an array.
  • With/without server GC enabled

The benchmark created 100,000 complex objects (1 timespan, 2 doubles, 2 ints, 2 int?s, and a list of strings with between 0 and 4 short elements (1 character)), repeated the serialization/deserialization process 30 times, and measured the total time taken and the number of GC collections that occurred during the run. The results were (running in Release, outside of VS):

GC IsServer: False, GC latency: Interactive, GC LOH compaction: Default
Wrapper serialization
Generation 0: 0 collects
Generation 1: 0 collects
Generation 2: 0 collects
Time: 20.363 s
------------------------
Array serialization
Generation 0: 0 collects
Generation 1: 0 collects
Generation 2: 0 collects
Time: 30.433 s
------------------------
Wrapper deserialization
Generation 0: 109 collects
Generation 1: 47 collects
Generation 2: 16 collects
Time: 71.277 s
------------------------
Array deserialization
Generation 0: 129 collects
Generation 1: 57 collects
Generation 2: 19 collects
Time: 89.145 s


GC IsServer: True, GC latency: Interactive, GC LOH compaction: Default
Wrapper serialization
Generation 0: 0 collects
Generation 1: 0 collects
Generation 2: 0 collects
Time: 20.430 s
------------------------
Array serialization
Generation 0: 0 collects
Generation 1: 0 collects
Generation 2: 0 collects
Time: 30.364 s
------------------------
Wrapper deserialization
Generation 0: 4 collects
Generation 1: 3 collects
Generation 2: 2 collects
Time: 39.452 s
------------------------
Array deserialization
Generation 0: 3 collects
Generation 1: 3 collects
Generation 2: 3 collects
Time: 47.546 s

So my conclusions are:

  • The wrapper approach benefits both serialization and deserialization (with the latter seeing the more pronounced effect).
  • The GC collection overhead imposed by the array approach is more noticeable when running without server GC. Also note that the GC performance impact is really bad when not running server GC and deserializing on multiple threads (results not included).
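For context, the server GC mode used in the second set of results is a standard .NET runtime setting, nothing protobuf-net-specific; on full .NET Framework it can be enabled in the application configuration file, for example:

```xml
<!-- app.config: opt the process into server GC -->
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```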

Hope someone finds this useful.

(unfortunately the benchmark code depends on internal code, so I cannot post the full code here).

Mads Ravn