
I am serialising a class containing `List` fields using protobuf-net v2. In some cases the `List` fields can get quite large, so my program fails with an `OutOfMemoryException`.

What I am now trying to implement is a process that writes the protobuf representation of my object to disk using a buffer, so that I don't overuse my memory. So I started with:

    using (var fs = File.OpenWrite(fullPath))
    {
        using (var bs = new BufferedStream(fs, 4096 * 2))
        {
            Serializer.Serialize(bs, myObject);
        }
    }

However, the same OOM exception gets thrown. I suspect I need to write more code for the buffering, but I am not even sure that what I plan to do is a good idea, or whether other methods exist; I imagine my problem is quite common.
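For completeness, a variant of the snippet above that passes an explicit buffer size to the `FileStream` constructor instead of wrapping the stream in a `BufferedStream` (this is only a sketch; on its own it does not change the memory behaviour, since the default `FileStream` buffer is already 4K, as noted in the comments below):

```csharp
using ProtoBuf;

// Open the file with an explicit 8 KB write buffer and stream the
// serialised object directly into it; fullPath and myObject are the
// same hypothetical names used in the snippet above.
using (var fs = new FileStream(fullPath, FileMode.Create, FileAccess.Write,
                               FileShare.None, 8192))
{
    Serializer.Serialize(fs, myObject);
}
```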

Many thanks in advance.

Sergey
  • Can you show us the exception that is thrown? – the complete `ToString()` representation of it. – dbc Aug 07 '14 at 10:30
  • Not really, because "The function evaluation was disabled because of an out of memory exception", so there is nothing useful in the Exception message. It's the general "An exception of type 'System.OutOfMemoryException' occurred in protobuf-net.dll but was not handled in user code" – Sergey Aug 07 '14 at 10:38
  • Ugh. You might at least be able to get the traceback by using "Debug" -> "Exceptions" -> "Break when an exception is thrown" for System.OutOfMemoryException. But it sounds like the problem is really that you are out of virtual memory rather than that some hash table or list needs to grow past, say, 256k entries and cannot. – dbc Aug 07 '14 at 10:42
  • I see what you mean. I am pretty certain I am not limited by the capacity, but just run out of memory, hence the idea for an intermediate container to store data, then save to disk, get more data, save and so on. – Sergey Aug 07 '14 at 10:45
  • Rather than doing [`File.OpenWrite`](http://msdn.microsoft.com/en-us/library/system.io.file.openwrite%28v=vs.110%29.aspx) could you use one of the [`FileStream` constructors](http://msdn.microsoft.com/en-us/library/f20d7x6t%28v=vs.110%29.aspx) that allow you to specify a buffer size? The [default FileStream buffer size](http://referencesource.microsoft.com/#mscorlib/system/io/filestream.cs) is 4K though, so I don't think that is the problem. – dbc Aug 07 '14 at 11:15
  • I think the problem is that it stores all of the serialised version of `myObject` in memory, which causes to run out of memory. As you say, FileStream's default allocation is 4k anyway, so it cannot be that. – Sergey Aug 07 '14 at 11:28
  • Stack Overflow suggests this "Related" question, which actually looks related: http://stackoverflow.com/questions/11317045/memory-usage-serializing-chunked-byte-arrays-with-protobuf-net?rq=1 – dbc Aug 07 '14 at 11:35
  • Adding `DataFormat = DataFormat.Group`, as per that question, fixed it; there's no memory problem anymore. Thanks @dbc for bringing the question to my attention. Now it's time to deal with the 6 GB file... Although my problem is fixed, part of my question concerns how to achieve serialisation using a buffer. – Sergey Aug 07 '14 at 13:23
  • @Sergey why do you want to use a buffer? – usr Aug 07 '14 at 14:08
  • @usr, well, I may be using word buffer incorrectly here, actually, but, it's coming from the problem of when serialising my object, it seems to store the serialised data entirely in memory and only then writing to disk. The buffer, in my context, would only hold a portion of my serialised object: read portion of data, write to disc, read more and repeat until finished and thus avoiding having to use large amounts of memory at any given time, as only a small portion of data will be stored in memory. What do you think? – Sergey Aug 07 '14 at 14:21
  • @Sergey I thought you had fixed this issue by using the Group format? What you are talking about appears to be "streaming". – usr Aug 07 '14 at 14:27
  • Hm, you are right, it is streaming; thinking about it, there's no way 6 GB would have held in memory. So it was all to do with the way protobuf-net was storing my collection: before the Group format, it had to read the entire collection in order to get its length, for some reason; now it only places markers at the start and end. It must have been streaming from the start, but because it had to read this large collection in its entirety, my program was crashing during that step. I assumed it was not streaming because of that behaviour, so yeah, it is streaming how I wanted! – Sergey Aug 07 '14 at 14:39
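The fix discussed in the comments can be sketched as follows (the class and member names here are hypothetical; the actual model is not shown in the question). With protobuf-net's default length-prefixed encoding, the serialiser must buffer each sub-object in memory to compute its length before writing it; `DataFormat.Group` uses start/end group markers instead, so the data can be streamed straight to disk:

```csharp
using System.Collections.Generic;
using ProtoBuf;

[ProtoContract]
public class MyObject
{
    // Group encoding writes start/end markers rather than a length
    // prefix, so protobuf-net does not need to hold the serialised
    // list in memory to measure it first.
    [ProtoMember(1, DataFormat = DataFormat.Group)]
    public List<MyItem> Items { get; set; }
}

[ProtoContract]
public class MyItem
{
    [ProtoMember(1)]
    public string Name { get; set; }
}
```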

0 Answers