First, let me state that my knowledge is pre-.NET 4.0, so this information may be outdated; I know improvements were planned in this area.
Do not use File.ReadAllBytes to read large files (larger than 85 KB), especially when you are doing it on many files sequentially. I repeat, do not.
Use a stream and BinaryReader.Read instead to buffer your reading. This may sound inefficient, since you won't push everything through the CPU in a single buffer, but doing it with ReadAllBytes simply won't work, as you discovered.
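Something along these lines; a minimal sketch, where ProcessChunk is a hypothetical stand-in for whatever you actually do with the bytes:

    using System.IO;

    class BufferedFileReader
    {
        const int BufferSize = 64 * 1024; // stays well under the ~85 KB LOH threshold

        public static void ReadBuffered(string path)
        {
            var buffer = new byte[BufferSize]; // one small, reusable buffer

            using (var stream = File.OpenRead(path))
            using (var reader = new BinaryReader(stream))
            {
                int read;
                while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ProcessChunk(buffer, read); // only the first 'read' bytes are valid
                }
            }
        }

        static void ProcessChunk(byte[] buffer, int count)
        {
            // hypothetical placeholder: hash, copy, parse, upload, etc.
        }
    }

The point is that no matter how big the file is, the only array you ever allocate is the small buffer, so nothing lands on the LOH.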
The reason is that ReadAllBytes reads the whole file into a single byte array. If that array is larger than 85 KB (there are other considerations, like the number of array elements), it goes on the Large Object Heap. That's fine in itself, BUT the LOH doesn't move memory around or defragment the freed space, so, simplifying, this can happen:
- Read a 1 GB file: you get a 1 GB chunk in the LOH. Save the file. (No GC cycle.)
- Read a 1.5 GB file: you request a 1.5 GB chunk of memory, and it goes at the end of the LOH. Say a GC cycle then runs, so the 1 GB chunk you used earlier is freed, but now the LOH spans 2.5 GB of memory with the first 1 GB free.
- Read a 1.6 GB file: the 1 GB free block at the beginning isn't big enough, so the allocator goes to the end again. Now the LOH spans 4.1 GB of memory.
- Repeat.
You are running out of memory, yet you surely aren't actually using it all; fragmentation is probably killing you. You can also hit a genuine OOM situation if a single file is very large (I think the user address space of a 32-bit Windows process is 2 GB?).
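You can see the pattern with a contrived sketch like the one below. Assumptions: it's compiled as a 32-bit (x86) process, and the exact point of failure will vary by machine. Each iteration stands in for File.ReadAllBytes on a slightly larger file, i.e. one contiguous LOH allocation per "file":

    using System;

    class LohGrowth
    {
        static void Main()
        {
            byte[] wholeFile = null;
            int sizeMb = 200;

            try
            {
                while (true)
                {
                    // Simulates File.ReadAllBytes: the entire "file" as one array.
                    // The hole left by the previous array is always a bit too
                    // small for the next request, so the LOH only grows.
                    wholeFile = new byte[sizeMb * 1024 * 1024];
                    Console.WriteLine("Allocated {0} MB", sizeMb);
                    sizeMb += 25; // the next "file" is slightly larger
                }
            }
            catch (OutOfMemoryException)
            {
                Console.WriteLine("OOM asking for {0} MB - well short of the 2 GB "
                                  + "a 32-bit process can address.", sizeMb);
            }
        }
    }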
If the files aren't ordered or dependent on each other, a few threads reading them through buffered BinaryReaders might get the job done, as sketched below.
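A rough sketch of that idea; it assumes .NET 4.0+ for Parallel.ForEach and that the files really can be processed independently:

    using System.IO;
    using System.Threading.Tasks;

    class ParallelFileReader
    {
        static void Main(string[] args)
        {
            string[] files = Directory.GetFiles(args[0]);

            // Disks rarely benefit from more than a handful of concurrent
            // readers, so cap the parallelism instead of one thread per file.
            var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

            Parallel.ForEach(files, options, path =>
            {
                var buffer = new byte[64 * 1024]; // one small buffer per worker

                using (var reader = new BinaryReader(File.OpenRead(path)))
                {
                    int read;
                    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        // process 'read' bytes from 'buffer' here
                    }
                }
            });
        }
    }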
References:
http://www.red-gate.com/products/dotnet-development/ants-memory-profiler/learning-memory-management/memory-management-fundamentals
https://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/