
I'm considering an n x n matrix M of 64-bit integers stored in main memory in row-major order. I have a 16 KB L1 data cache split into 64 B blocks (no L2 or L3). My code prints each element of the matrix one at a time, traversing it in either row-first or column-first order.

In the case where n = 16 (i.e. 16 x 16 matrix), I've counted 0 cache misses using both row-first order and column-first order since the matrix M fits entirely in the 16KB cache (it never needs to jump to main memory to fetch an element). How would I deal with the case of, say, n = 256 (256 x 256 matrix of 64-bit ints); i.e. when M doesn't fully fit in the cache? Do I count all the ints that don't fit as misses, or can spatial locality be leveraged somehow? Assume the cache is initially empty.

chris_lee
  • I don't see the "file" part from the title? "Cache" is a general concept - your OS typically has an in-memory cache for files on disk, but that's not a 16 kB cache split in 64B blocks. – MSalters Oct 09 '20 at 14:30
  • @MSalters apologies - I've changed "file" so as not to be misleading (hopefully it's better now?). Furthermore, we should assume the cache is initially empty - would this change the answer for n = 16 (i.e. lead to non-zero misses)? Assuming an empty cache would mean I'd have a miss for each block that I bring into the cache, so wouldn't that be 32 misses? – chris_lee Oct 09 '20 at 14:55
  • If the cache is initially empty, then the first access must be a cache miss. Therefore the answer cannot be zero. How much more, you'll have to figure out. Hint: don't count just cache misses, but also cache evictions. – MSalters Oct 09 '20 at 14:59
  • @MSalters thanks so much! Think I got it sorted :) – chris_lee Oct 10 '20 at 00:31

1 Answer


The "0 cache misses" seems to assume you start out with M already in cache. That's already a bit suspicious, but OK.

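For the 256x256 case, you need to simulate how the cache behaves. Every access to an element that is not currently in the cache is a miss, and each miss brings in a whole 64-byte block: the requested int plus the 7 other ints that share its block. Assuming M starts on a block boundary, those 7 ints are neighbours in the same row, so whether they are still cached when you reach them depends on the traversal order and on which blocks have been evicted in between.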
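To make "simulate how the cache behaves" concrete, here is a minimal sketch of such a simulation. The question does not state the cache's associativity or replacement policy, so this assumes a direct-mapped cache by default (with LRU replacement if you raise `ways`), an initially empty cache, and a matrix that starts at address 0. The function name `count_misses` and its parameters are made up for illustration.

```python
from collections import OrderedDict

CACHE_BYTES = 16 * 1024   # 16 KB L1 data cache (from the question)
BLOCK_BYTES = 64          # 64 B blocks (from the question)
ELEM_BYTES = 8            # 64-bit integers

def count_misses(n, order, ways=1):
    """Count misses for one full traversal of an n x n row-major matrix.

    order: "row" for row-first traversal, "col" for column-first.
    ways:  assumed associativity (1 = direct-mapped); LRU within each set.
    Assumes the cache starts empty and the matrix starts at address 0.
    """
    num_sets = CACHE_BYTES // BLOCK_BYTES // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for outer in range(n):
        for inner in range(n):
            row, col = (outer, inner) if order == "row" else (inner, outer)
            # Block number of the accessed element in row-major layout.
            block = (row * n + col) * ELEM_BYTES // BLOCK_BYTES
            lines = sets[block % num_sets]
            if block in lines:
                lines.move_to_end(block)       # hit: mark as most recently used
            else:
                misses += 1                    # miss: fetch the whole block
                if len(lines) == ways:
                    lines.popitem(last=False)  # evict the least recently used
                lines[block] = True
    return misses

if __name__ == "__main__":
    for n in (16, 256):
        for order in ("row", "col"):
            print(n, order, count_misses(n, order))
```

Under these assumptions, n = 16 gives 32 misses in either order (one per block, nothing is evicted). For n = 256, row-first order gives 8192 misses (one per block, then 7 hits), while column-first order gives 65536 misses, because each block is evicted before its other 7 ints are reached. With a fully associative LRU cache (`ways=256`) the column-first count drops to 8192, so the answer depends on the associativity you are told to assume.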

MSalters