I'm trying to understand shared-memory architectures, especially ccNUMA systems. I have read about the first-touch policy, but I am still a bit confused about how data is distributed across memory pages. Consider the example below, and assume there are 5 threads. Under the first-touch policy, is it true that the processor performing the first write takes the page, and that this page will contain all array elements from A[0] to A[199] inclusive? Is that still true even if the number of bytes is less than the page size? Will this be a whole page (page number 0, for example)?
#pragma omp parallel for
for (int i = 0; i < 1000; i++) {
    A[i] = i; // dummy values
}