You need an array of 10^10 4-byte integers to be used as a look-up table. Loading it into RAM would take 40 GB, which isn't feasible. You never need to write to this array after it has been initialized. You need to read individual integer values from random locations of this array concurrently from multiple threads of a single process. You're guaranteed to be on a 64-bit platform. What is the fastest implementation of this look-up table? Using regular file-reading functions, or e.g. a Boost memory-mapped file?
-
What is this array supposed to be storing/doing? – Jim Fell Feb 13 '12 at 21:20
-
For random access I'd guess regular stream IO. Memory mapping doesn't help much if there's no access pattern and the data doesn't (mostly) fit in RAM (to my knowledge). – Mooing Duck Feb 13 '12 at 21:20
-
@Jim Fell: The array is used to map the indexing value x to f(x) where f is a very slow function, so it can't be used at run-time. – zeroes00 Feb 13 '12 at 21:29
-
It depends on the access heuristic: are accesses supposed to be absolutely random or not? – Dmitriy Kachko Feb 13 '12 at 21:30
-
@zeroes00: I see. So, if the array is a composite of variations on f(x), I suggest breaking the array into smaller arrays. For example, you can break it up by which variable is being altered or by the range of the variation, etc. That way your algorithm can deterministically load only those parts of the array that are needed, resulting in faster response times. Or, better yet, treat the file as a database and only read out the values you need, instead of trying to read out the entire file. – Jim Fell Feb 13 '12 at 21:33
-
You have two obvious approaches (map the entire file vs. lots of seeking, maybe breaking down the seek code into several cases with different buffer sizes set on the file stream). I don't see much point in speculating until you've done some kind of testing -- what it basically comes down to is whether reading the file through the memory mapping causes the OS to do more wasted caching than does reading it through a stream. – Steve Jessop Feb 13 '12 at 21:41
-
@DmitryKachko: The accesses should be at pretty much random locations. I don't see any reason why the next index would be inclined to be close to the previous one. – zeroes00 Feb 13 '12 at 21:50
2 Answers
It sounds like you should do explicit reads.
Memory mapping gets its speed from bringing in large chunks of pages at a time (I believe Windows does 256 KiB; I'm not sure about other platforms) and letting you re-access them without any penalty after the first time.
If you're just reading integers from random locations, you'll be reading in 256 KiB to use just 4 bytes of one page, and maybe never re-access it. Such a waste! Consider too that you've just paged out a lot of possibly useful data from other apps and the filesystem cache.
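A minimal sketch of the explicit-read approach, assuming a POSIX platform: `pread` takes an explicit offset, so every thread can read through one shared read-only descriptor with no seeking and no locking. The class name and file path are illustrative, not from the question.

```cpp
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical wrapper: one read-only descriptor shared by all threads.
class FileTable {
public:
    explicit FileTable(const char* path) : fd_(open(path, O_RDONLY)) {}
    ~FileTable() { if (fd_ >= 0) close(fd_); }

    // Thread-safe: pread reads at an explicit offset, so concurrent
    // callers never race on a shared file position.
    std::int32_t at(std::uint64_t index) const {
        std::int32_t value = 0;
        // off_t is 64-bit on the platforms the question targets,
        // so index * 4 may safely exceed 4 GiB.
        pread(fd_, &value, sizeof value,
              static_cast<off_t>(index) * sizeof value);
        return value;
    }

private:
    int fd_;
};
```

With this scheme the kernel still caches the pages you touch, but you never fault in a large mapping window for a single 4-byte lookup.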

Since, once the file is created, you only ever need to access it in a read-only way, I wouldn't think you'd want the expense of a memory-mapped file, Boost or otherwise. That would be more useful if you had multiple processes wanting to concurrently access the same data. In your case, you've just got read-only threads, so simple reads from a plain 40 GB file should be the simplest and fastest.
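One way to sketch that plain-file scheme is to give each thread its own `std::ifstream` over the same file, so readers never contend for a shared file position and no locking is needed. The class and file names here are made up for illustration:

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical per-thread reader: each thread constructs its own
// TableReader over the same large file, so streams never interfere.
class TableReader {
public:
    explicit TableReader(const std::string& path)
        : in_(path, std::ios::binary) {}

    std::int32_t lookup(std::uint64_t index) {
        std::int32_t value = 0;
        // std::streamoff is 64-bit on 64-bit platforms, so offsets
        // beyond 4 GiB into the 40 GB file are representable.
        in_.seekg(static_cast<std::streamoff>(index) * sizeof value);
        in_.read(reinterpret_cast<char*>(&value), sizeof value);
        return value;
    }

private:
    std::ifstream in_;
};
```

The cost per lookup is one seek plus one small read; the OS page cache still keeps recently touched blocks hot without you mapping anything yourself.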
