I am writing a NUMA-aware cache for large objects (matrices of doubles) for a 4-socket server. I have observed that inter-socket communication is the bottleneck for my application. Hence, I want threads on different sockets to have separate matrix caches. I have bound the threads to specific physical processors, and now I need each thread to select the correct cache.
Suppose the cache is defined as follows:
matrix_cache_t *cache[SOCKETS_LIMIT];
I need each thread to know its socket id and select the correct cache, e.g. cache[0], cache[1], cache[2] or cache[3].
I am writing the application in C using OpenMP, and it must run on both Windows and Linux.
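To make the goal concrete, here is a minimal sketch of the kind of lookup I have in mind. It is not a tested solution. The names current_socket_id and cache_for_socket are hypothetical, SOCKETS_LIMIT is assumed to be 4, and matrix_cache_t is left as an opaque placeholder. On Linux the sketch assumes the sysfs file physical_package_id is available for the current CPU. On Windows it assumes that each NUMA node corresponds to exactly one socket, which may not hold on every machine.

```c
#define _GNU_SOURCE            /* needed for sched_getcpu() on glibc */
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sched.h>
#endif

#define SOCKETS_LIMIT 4        /* assumption: 4-socket server */

typedef struct matrix_cache matrix_cache_t;   /* placeholder for the real type */

matrix_cache_t *cache[SOCKETS_LIMIT];

/* Hypothetical helper: returns an id for the socket the calling thread
 * is currently running on, or -1 if it cannot be determined. */
static int current_socket_id(void)
{
#ifdef _WIN32
    PROCESSOR_NUMBER pn;
    USHORT node;

    GetCurrentProcessorNumberEx(&pn);
    if (!GetNumaProcessorNodeEx(&pn, &node))
        return -1;
    return (int)node;          /* NUMA node, assumed to map 1:1 to a socket */
#else
    char path[128];
    FILE *f;
    int id = -1;
    int cpu = sched_getcpu();  /* CPU the thread is running on right now */

    if (cpu < 0)
        return -1;
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/topology/physical_package_id",
             cpu);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &id) != 1)
        id = -1;
    fclose(f);
    return id;
#endif
}

/* Hypothetical helper: bounds-checked access to the per-socket cache. */
static matrix_cache_t *cache_for_socket(int socket_id)
{
    if (socket_id < 0 || socket_id >= SOCKETS_LIMIT)
        return NULL;
    return cache[socket_id];
}
```

Since the threads are already bound, each OpenMP thread could call current_socket_id() once at the start of the parallel region and keep the result in a private variable, so the lookup cost is not paid on every cache access.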