
I need to store a rather large amount of binary data in a database (MS SQL) and interact with this database via EF. Unfortunately, EF doesn't support FILESTREAM (more precisely, there's no streaming support).

So, I've decided to store the data in chunks. A chunk is just an entity type:

using System.Collections.ObjectModel;

public class Data
{
  public int Id { get; set; }
  public Collection<Chunk> Chunks { get; set; }
}

public class Chunk
{
  public int Id { get; set; }
  public int DataId { get; set; }     // FK to the owning Data row
  public byte[] Content { get; set; } // the chunk payload
}

First, I wanted to limit the chunk size to some optimal value, say, 1 MB.
But then I remembered about large objects and the LOH.

As far as I understand, every Chunk.Content array larger than ~85,000 bytes will be allocated on the large object heap, with the usual consequences (memory fragmentation in particular). Hence, intensive creation of Chunk objects will eventually cause an OutOfMemoryException (it is a 24/7 application, and working with this binary data is its main purpose).

I can't reuse any buffer for the chunk data, because I'm using EF...
Should I decrease the chunk size to below 85 KB?
Or is this paranoia? :) For illustration, here is a minimal sketch of the chunked write I have in mind (MyContext is a hypothetical DbContext with a Datas set; the buffer size is kept below the 85,000-byte LOH threshold):
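
using System;
using System.Collections.ObjectModel;
using System.IO;

// MyContext is a hypothetical DbContext exposing a DbSet<Data> named Datas.
public static void SaveInChunks(MyContext context, Stream source)
{
  const int ChunkSize = 80 * 1024; // safely below the 85,000-byte LOH threshold

  var data = new Data { Chunks = new Collection<Chunk>() };
  var buffer = new byte[ChunkSize];
  int read;
  while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
  {
    // EF needs a distinct array per entity, so the read buffer is copied;
    // each Content array is still small enough to stay off the LOH.
    var content = new byte[read];
    Array.Copy(buffer, content, read);
    data.Chunks.Add(new Chunk { Content = content });
  }

  context.Datas.Add(data);
  context.SaveChanges();
}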

Dennis
  • Do you have to store the stream in the database? I had many problems with that, such as performance, compression, etc. – DarthVader Sep 04 '12 at 06:14
  • @DarthVader: I can't store the data in FILESTREAM because of EF limitations, if that's what you're asking about. – Dennis Sep 04 '12 at 06:21
  • No, I'm saying: store the file on a NAS or locally, then store its location in the database. – DarthVader Sep 04 '12 at 06:23
  • @DarthVader: that is an alternative I'm considering too... But, unfortunately, the main disadvantage of this approach is that the two stores can get out of sync, especially when you need to backup/restore the database. This is significant for me. – Dennis Sep 04 '12 at 06:30

1 Answer


A LOH fragmentation issue can occur only if you have too many chunks in memory at the same time, but from your description it looks like you will have one chunk per "process". How many concurrent processes do you expect to have? If you expect many, you most probably also expect hardware with enough processing power and memory (a 64-bit platform). Isn't the hardware the real limit on how many chunks can be processed in parallel?

Anyway, if you need to store a stream in the database, you should simply use FILESTREAM without EF. Staying with EF at all costs is an architectural failure. For example, here is a minimal ADO.NET sketch of streaming into a FILESTREAM column (the dbo.Data table and its Content column are assumptions; the row must already exist with a non-NULL FILESTREAM value):
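
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

public static void WriteToFileStreamColumn(string connectionString, int dataId, Stream source)
{
  using (var conn = new SqlConnection(connectionString))
  {
    conn.Open();
    using (var tx = conn.BeginTransaction())
    {
      string path;
      byte[] txContext;

      // PathName() and GET_FILESTREAM_TRANSACTION_CONTEXT() are only valid
      // inside a transaction.
      using (var cmd = new SqlCommand(
        "SELECT Content.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
        "FROM dbo.Data WHERE Id = @id", conn, tx))
      {
        cmd.Parameters.AddWithValue("@id", dataId);
        using (var reader = cmd.ExecuteReader())
        {
          reader.Read();
          path = reader.GetString(0);
          txContext = reader.GetSqlBytes(1).Buffer;
        }
      }

      // Stream the payload directly to SQL Server; no large managed buffers.
      using (var sqlStream = new SqlFileStream(path, txContext, FileAccess.Write))
      {
        source.CopyTo(sqlStream, 64 * 1024); // copy buffer stays well below the LOH threshold
      }

      tx.Commit();
    }
  }
}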

Ladislav Mrnka
  • Ladislav, working with the data will be part of a WCF service, so I need to handle concurrent chunk processing. I'm confused about `NextObjPtr`. As far as I understand, allocating memory from the heap increases `NextObjPtr`, and compacting the heap decreases it. Because the LOH is never compacted, `NextObjPtr` for the LOH will never decrease, and continuous allocation of large objects will eventually cause an OME. Am I wrong? – Dennis Sep 04 '12 at 07:47
  • Found here: http://blogs.msdn.com/b/maoni/archive/2006/04/18/large-object-heap.aspx: "Free blocks between live large objects are threaded into a free list which will be used to satisfy large object allocation requests". So the heap space will be effectively reused. – Dennis Sep 04 '12 at 08:03
  • Memory consumption does not grow infinitely. If a block is released, it can be reused. An OME is caused by fragmentation: there is no more memory to allocate, and no single free block in the LOH is as large as the requested size (even though the total free memory in the LOH exceeds it). – Ladislav Mrnka Sep 04 '12 at 08:21