Consider the following matrix transpose routine:

typedef int array[4][4];

void transpose(array dst, array src){
    int i,j;

    for(i=0;i<4;i++){
        for (j=0;j<4;j++){
            dst[j][i] = src[i][j];
        }
    }
}

Assume this code runs on a machine with the following properties:

  • sizeof(int) = 4
  • The src array starts at address 0 and the dst array starts at address 64.
  • There is a single L1 data cache that is direct-mapped, write-through, and write-allocate, with a block size of 16 bytes.
  • The cache has a total size of 32 data bytes and is initially empty.
  • Accesses to the src and dst arrays are the only sources of read and write misses.
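
To make the cache geometry concrete, here is a minimal sketch that maps each array row to a cache set. It assumes the usual direct-mapped indexing, set = (address / block size) mod number of sets, with 32 / 16 = 2 sets. The helper names `set_index`, `elem_addr`, and `print_set_mapping` are hypothetical and not part of the original problem.

```c
#include <assert.h>
#include <stdio.h>

/* Parameters taken from the problem statement. */
enum { N = 4, INT_SIZE = 4, BLOCK_SIZE = 16, CACHE_SIZE = 32 };
enum { NUM_SETS = CACHE_SIZE / BLOCK_SIZE };   /* 2 sets, one line each */
enum { SRC_BASE = 0, DST_BASE = 64 };

/* Hypothetical helper: set index of a byte address in a direct-mapped cache. */
static int set_index(int addr) {
    return (addr / BLOCK_SIZE) % NUM_SETS;
}

/* Hypothetical helper: byte address of element [row][col] of a 4x4 int array. */
static int elem_addr(int base, int row, int col) {
    assert(row >= 0 && row < N && col >= 0 && col < N);
    return base + (row * N + col) * INT_SIZE;
}

/* Print which set each row of src and dst falls into. */
static void print_set_mapping(void) {
    for (int row = 0; row < N; row++) {
        printf("row %d: src -> set %d, dst -> set %d\n", row,
               set_index(elem_addr(SRC_BASE, row, 0)),
               set_index(elem_addr(DST_BASE, row, 0)));
    }
}
```

Each row is exactly one 16-byte block, so under these assumptions row r of either array lands in set r mod 2. That means a row of src and a row of dst with the same parity compete for the same cache line.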

I am tasked with determining whether each access to each array is a cache hit or a miss.

For the dst array I got:

_____________________________________
|    Col 0    Col 1    Col 2    Col 3|
|Row0    m        m        m        m|
|Row1    m        m        m        m|
|Row2    m        m        m        m|
|Row3    m        m        m        m|
______________________________________

and for the src array:

_____________________________________
|    Col 0    Col 1    Col 2    Col 3|
|Row0    m        h        h        h|
|Row1    m        h        h        h|
|Row2    m        h        h        h|
|Row3    m        h        h        h|
______________________________________

Are these correct? I asked other students, and everyone seems to have gotten a different answer.
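
One way to check tables like these is to replay the loop against a tiny model of the cache. The sketch below assumes a two-set direct-mapped cache with 16-byte blocks, that a write miss loads the block (write-allocate), and that within each iteration the read of `src[i][j]` reaches the cache before the write to `dst[j][i]`. The names `line_t`, `cache_access`, `simulate`, and `print_table` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum { N = 4, INT_SIZE = 4, BLOCK_SIZE = 16, NUM_SETS = 2 };
enum { SRC_BASE = 0, DST_BASE = 64 };

/* One line per set (direct-mapped): a valid bit plus the block number held. */
typedef struct {
    bool valid;
    int block;
} line_t;

/* Access one byte address; returns true on a hit. On a miss the block is
   loaded, for reads and (because of write-allocate) for writes alike. */
static bool cache_access(line_t cache[NUM_SETS], int addr) {
    assert(addr >= 0);
    int block = addr / BLOCK_SIZE;
    line_t *line = &cache[block % NUM_SETS];
    if (line->valid && line->block == block)
        return true;
    line->valid = true;
    line->block = block;
    return false;
}

/* Replay the transpose loop, recording hit (true) or miss (false) per element. */
static void simulate(bool src_hit[N][N], bool dst_hit[N][N]) {
    line_t cache[NUM_SETS] = {{false, 0}, {false, 0}};  /* initially empty */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            /* dst[j][i] = src[i][j]: read src first, then write dst. */
            src_hit[i][j] = cache_access(cache, SRC_BASE + (i * N + j) * INT_SIZE);
            dst_hit[j][i] = cache_access(cache, DST_BASE + (j * N + i) * INT_SIZE);
        }
    }
}

/* Print a table in the same row/column layout as the ones above. */
static void print_table(const char *name, bool hit[N][N]) {
    printf("%s\n", name);
    for (int r = 0; r < N; r++) {
        for (int c = 0; c < N; c++)
            printf(" %c", hit[r][c] ? 'h' : 'm');
        printf("\n");
    }
}
```

Calling `simulate` and then `print_table` on each result array prints the two tables. Under the assumptions above, this model reports all 16 dst writes as misses and 10 of the 16 src reads as misses, because a dst write that maps to the same set as the current src row evicts it between reads.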

JPHamlett
  • Does `dest` really have address 63, which isn't word-aligned? – MikeCAT Dec 14 '15 at 15:38
  • Whoops, mistyped. It has 64, will edit my post. – JPHamlett Dec 14 '15 at 15:39
  • And where will `i`, `j`, `dst` (the pointer passed as the argument), and `src` (the pointer passed as the argument) be stored? What is the actual machine language to be executed? (They may be kept in registers thanks to optimization.) – MikeCAT Dec 14 '15 at 15:39
  • OK, how would the problem be tackled the way you mentioned? – JPHamlett Dec 14 '15 at 15:44
  • I think this part also makes it safe to make this assumption: *Access to the src and dst array are the only sources of read and write misses.* Maybe it'd help to emphasize that or put it in bold. –  Dec 14 '15 at 15:57
  • I believe what you have is correct, but an expert needs to jump in. Each iteration accesses an aligned row of both `dst` and `src`. Each iteration will miss for `dst`, since it cycles through rows there; `src` will have a compulsory miss on the first column but not on the second, third, and fourth. –  Dec 14 '15 at 16:05
  • Maybe re-make the example so that both variables are declared at file scope? – Lundin Dec 14 '15 at 16:08
  • Possible duplicate of [Cache Memory Optimization Array Transpose: C](http://stackoverflow.com/questions/20549940/cache-memory-optimization-array-transpose-c) – Martin Zabel Dec 14 '15 at 18:03
