0

I have been trying to get better awareness of cache locality. I produced the 2 code snippets to gain better understanding of the cache locality characteristics of both.

vector<int> v1(1000, some random numbers);
vector<int> v2(1000, some random numbers);
for(int i = 0; i < v1.size(); ++i)
{
  auto min = INT_MAX;
  for(int j = 0; j < v2.size(); ++j)
  {
    if(v2[j] < min)
    {
       v1[i] = j;
    }
  }
}

vs

vector<int> v1(1000, some random numbers);
vector<int> v2(1000, some random numbers);
for(int i = 0; i < v1.size(); ++i)
{
  auto min = INT_MAX;
  auto loc = -1;
  for(int j = 0; j < v2.size(); ++j)
  {
    if(v2[j] < min)
    {
       loc = j;
    }
  }
  v1[i] = loc;
}

In the first code v1 is being updated directly inside the if statement. Does this cause cache locality issues because during the update it'll replace the current cache line with some contiguous segment of data from v1[i] to v1[cache_line_size/double_size]? If this analysis is correct, then this would seem to slow down the loop over j, since for each iteration of the j loop, it'll likely have cache misses. It seems the second implementation alleviates this issue by using a local variable and not updating v1[i] until after the j loop is complete?

In practice, I think the compiler might optimize the cache locality issue away? For discussion, how about we assume no compiler optimizations.

roulette01
  • 1,984
  • 2
  • 13
  • 26
  • Note: The compiler is likely to notice that the inner loop will produce the exact same result every time, hoist it, run it once and remember the result. Then all it needs to do is initialize `v1`with N elements of the result. – user4581301 Sep 07 '20 at 17:08
  • The cache stores more than one line – user253751 Sep 07 '20 at 17:08
  • @user4581301 Yikes, I didn't mean to make this example so simple. I mean for it to be more complicated than that, but didn't want to overcomplicate it to divert attention away from the real question. – roulette01 Sep 07 '20 at 17:10
  • @user253751 How is the number of cache lines determined? – roulette01 Sep 07 '20 at 17:10
  • 1
    It's part of the design of the processor. – user253751 Sep 07 '20 at 17:14

0 Answers0