
I want to load a large set of MAT-files in a loop. I'm testing different ways to make the files load faster, and I have a subset of 10,000 files I'm working with, each containing about 50 variables of different sizes. I noticed an interesting detail:

  1. If I load 10,000 files using load(filename) in a loop one after another, it takes about 5 minutes.
  2. If I load the same set of files a few more times (basically repeat the test), the time doesn't change.
  3. If I load only one variable from each file using load(filename, 'varname'), it also takes about 5 minutes.
  4. If I repeat step 3, it takes about 15 seconds to complete the load. Same files, same variable being loaded.
  5. If I now run step 1 again and then repeat step 3, the load is back to taking about 5 minutes. A second run of step 3, however, is fast again.
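
A minimal sketch of the timing test in steps 1 and 3, for anyone who wants to reproduce it. The folder, file names, and the variable names 'varname' and 'other' are placeholders, and the stand-in files it creates are far smaller than the real ones, so the absolute timings will not match the numbers above.

```matlab
% Minimal timing sketch. dataDir, the file names, and the variable names
% are placeholders (assumptions), not the real data set.
dataDir = fullfile(tempdir, 'mat_load_test');
if ~exist(dataDir, 'dir')
    mkdir(dataDir);
end

% Create a few small stand-in files so the sketch runs on its own.
nFiles = 5;
for k = 1:nFiles
    varname = rand(100);   % the single variable read in step 3
    other   = rand(200);   % stands in for the remaining variables
    save(fullfile(dataDir, sprintf('file%03d.mat', k)), 'varname', 'other');
end

files = dir(fullfile(dataDir, '*.mat'));

% Step 1: load every variable from each file.
tic;
for k = 1:numel(files)
    s = load(fullfile(dataDir, files(k).name));
end
tFull = toc;

% Step 3: load a single variable from each file.
tic;
for k = 1:numel(files)
    s = load(fullfile(dataDir, files(k).name), 'varname');
end
tOne = toc;

fprintf('full load: %.2f s, single variable: %.2f s\n', tFull, tOne);
```

Running the script twice in a row, and again after restarting Matlab, is the comparison described in steps 2, 4, and 5.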

I'm puzzled. Is Matlab somehow keeping the data in memory after it has loaded it from a file once? The effect survives Matlab restarts and clear commands, though, so could it actually be Windows 7 that is keeping a memory cache of some of the data?

Needless to say, I would like to determine what's causing the unexpected improvement and, if possible, reproduce it to make the first load as fast as the subsequent ones.

dima1109
  • You are probably seeing a combination of Windows file-caching behaviour and Matlab memory-management behaviour. If you were writing your own code in, say, Java, I'd say combine all 10,000 files into one file and read it in one go. – hack_on Aug 16 '13 at 22:46
  • I doubt it's the Matlab memory management, because the effect persists over a Matlab restart. I didn't think Windows was able to cache only parts of files (the total size is around 10 GB, too big to fit in memory at once). – dima1109 Aug 16 '13 at 22:57
  • 15 seconds is probably not enough time to open 10,000 files and read 10 GB from the disk, so perhaps the result in step 4 is because Matlab can cache the one item from each file but cannot cache 50 items from each file. – hack_on Aug 16 '13 at 23:03

0 Answers