11

If I have a multi-processor board with cache-coherent non-uniform memory access (NUMA), i.e. separate "northbridges" with separate RAM for each processor, does any compiler know how to automatically spread data across the different memory systems so that threads mostly retrieve their data from the RAM attached to the processor they are running on?

I have a setup where 1 GB is attached to processor 0, 1 GB to processor 1, etc., for up to 4 processors. In the coherent memory space, the RAM on the first processor occupies physical addresses 0 to 1GB-1, the RAM on the second occupies 1GB to 2GB-1, and so on.
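To make that layout concrete, here is a minimal sketch of the address-to-node arithmetic it implies. The helper name `node_of_phys_addr` and the two constants are made up for illustration, and the code assumes exactly four nodes with one contiguous 1 GB slice each. Note that a user process normally sees only virtual addresses, so this describes the hardware layout rather than something an application can act on directly.

```c
#include <stdint.h>

/* Assumption: each processor owns one contiguous 1 GB slice of the
   physical address space, and there are four processors in total. */
#define NODE_SPAN_BYTES (UINT64_C(1) << 30)
#define NODE_COUNT      4

/* Hypothetical helper: returns the node that owns a physical address,
   or -1 if the address lies beyond the last node's RAM. */
int node_of_phys_addr(uint64_t phys_addr)
{
    uint64_t node = phys_addr / NODE_SPAN_BYTES;
    return node < NODE_COUNT ? (int)node : -1;
}
```

Under these assumptions, address 0 and address 1GB-1 both map to node 0, and address 1GB maps to node 1.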

Will any compiler, or perhaps malloc specifically, associate memory newly allocated by a process on a specific core with the physical RAM attached to that core?

Ross Rogers
  • Out of interest, who is the board manufacturer? – rama-jka toti Jan 26 '10 at 21:31
  • I posed the question like this, but my original problem has to do with a number of cores on 1 die and the cost of doing a memory access for cores at different parts of the chip for different memory regions. – Ross Rogers Jan 27 '10 at 01:32

3 Answers

7

The Linux kernel is NUMA-aware and will try to give your process pages from memory local to the CPU it is currently running on (source: U. Drepper, "What Every Programmer Should Know About Memory").
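One practical consequence is the "first touch" pattern: with the default policy, a page is usually backed by physical memory only when it is first written, and it tends to come from the node of the CPU doing the writing. The sketch below is an illustration under that assumption, not a guarantee. The names `alloc_first_touch`, `touch_slice` and `struct slice` are made up for this example, and it assumes the buffer is large enough that `malloc` hands back pages that have not been touched yet.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One contiguous piece of the shared buffer, owned by one worker thread. */
struct slice {
    char  *base;
    size_t len;
};

/* The first write to each page is what triggers the physical allocation. */
static void *touch_slice(void *arg)
{
    struct slice *s = arg;
    memset(s->base, 0, s->len);
    return NULL;
}

/* Hypothetical helper: allocate `total` bytes and let each of `nthreads`
   threads initialise its own slice. Returns the buffer, or NULL on failure. */
char *alloc_first_touch(size_t total, int nthreads)
{
    if (nthreads <= 0)
        return NULL;

    char *buf = malloc(total);
    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    struct slice *slices = malloc((size_t)nthreads * sizeof *slices);
    int started = 0, ok = (buf && tids && slices);

    size_t chunk = total / (size_t)nthreads;
    for (int i = 0; ok && i < nthreads; i++) {
        slices[i].base = buf + (size_t)i * chunk;
        /* The last slice also takes the remainder of the division. */
        slices[i].len = (i == nthreads - 1) ? total - (size_t)i * chunk : chunk;
        if (pthread_create(&tids[i], NULL, touch_slice, &slices[i]) != 0)
            ok = 0;
        else
            started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    free(slices);
    if (!ok) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The threads here are not pinned, so the scheduler may still migrate them away from the memory they touched. Pinning each worker (for example with `pthread_setaffinity_np`) before it touches its slice is the usual way to keep thread and data together.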

Nikolai Fetissov
  • 5
    ..and in fact it *needs* to be done in the kernel, because in general userspace processes don't control how their linear addresses map onto physical addresses, since they don't have control of their page tables. – caf Jan 26 '10 at 23:25
5

NUMA-aware memory allocation is not done at compile time. Baking assumptions about a particular memory topology into the binary would be bad for portability.

On Linux, this is a kernel function, though you can control it at run time with the numactl command, the set_mempolicy system call, or the libnuma library.
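As a rough illustration of the run-time route, the sketch below calls `set_mempolicy` through the raw `syscall` interface so that it does not depend on libnuma being installed. The helper names `prefer_node` and `reset_policy` are made up for this example, and it assumes a Linux system whose headers provide `<linux/mempolicy.h>`. The call can fail (for instance if the node does not exist or a sandbox blocks the syscall), so the return value should always be checked.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>

/* Hypothetical helper: ask the kernel to prefer `node` for the calling
   thread's future page allocations. Returns 0 on success, -1 with errno
   set on failure. */
int prefer_node(unsigned node)
{
    unsigned long mask;
    unsigned long bits = 8 * sizeof mask;

    /* Keep one bit of headroom: the kernel may ignore the top bit. */
    if (node >= bits - 1) {
        errno = EINVAL;
        return -1;
    }
    mask = 1UL << node;
    return (int)syscall(SYS_set_mempolicy, MPOL_PREFERRED, &mask, bits);
}

/* Hypothetical helper: go back to the system default (local allocation). */
int reset_policy(void)
{
    return (int)syscall(SYS_set_mempolicy, MPOL_DEFAULT, NULL, 0UL);
}
```

`MPOL_PREFERRED` falls back to other nodes when the preferred one is full, whereas `MPOL_BIND` would restrict allocation to the given nodes. The same effect is available without code changes by launching the program under numactl, e.g. `numactl --cpunodebind=0 --membind=0 ./app`.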

Eric Seppanen
4

On Microsoft platforms, the compiler is not NUMA-aware. The operating system is, however, and will attempt to allocate memory on the same node as the thread requesting it.

See http://code.msdn.microsoft.com/64plusLP for some more details on how recent versions of Windows handle NUMA.

Michael