Questions tagged [numa]

NUMA stands for Non-Uniform Memory Access. It is a general Linux term indicating that the hardware has multiple memory nodes and that not all processing units have equal access to all memory.

As processors become faster and faster, proximity to memory increases in importance for overall computing performance. NUMA systems address this problem by building closer connections between specific computing resources and memory.

307 questions
5
votes
1 answer

Bind tmpfs or ramfs to a specific memory node

I'm working on a NUMA server which has two memory nodes. I want to create a file system which will be loaded in main memory like tmpfs or ramfs and I want to bind it to a specific memory node. In other words I don't want the ramfs contents to be…
user3761809
  • 111
  • 6
5
votes
2 answers

Get node distance (hops) in NUMA systems

Is there any API/way to get the "distance" (called 'hops' in literature) between two NUMA nodes? I want to implement a memory allocation system that takes advantage of this (reuse memory from the nearest node, because the access is faster). Windows…
Gratian Lup
  • 1,485
  • 3
  • 19
  • 29
5
votes
2 answers

Are there limits on allocating small chunks using numa_alloc_onnode()?

I am experimenting with NUMA on a machine that has 4 Opteron 6272 processors, running CentOS. There are 8 NUMA nodes, each with 16GB of memory. Here is a small test program I'm running. void pin_to_core(size_t core) { cpu_set_t cpuset; …
Alexander Chertov
  • 2,070
  • 13
  • 16
5
votes
1 answer

mbind returns EINVAL

I am using the code provided for the following question numa+mbind+segfault; every call to mbind returns EINVAL. How can I find out what exactly is wrong? I am asking because EINVAL can be returned for many reasons. page_size =…
tiki
  • 419
  • 1
  • 6
  • 16
5
votes
5 answers

starting mongodb via numactl as daemon

I'm trying to get mongodb started on a NUMA machine as a daemon. When I run numactl --interleave=all mongod & Mongo starts and runs correctly, but all the output still shows up. (e.g., Fri Jun 22 12:10:29 [initandlisten] connection accepted from…
Libby
  • 581
  • 2
  • 6
  • 21
5
votes
1 answer

Should I worry about NUMA in one CPU system?

Is there any implication for a Windows developer for NUMA supported CPU architecture if only one CPU is present?
Boppity Bop
  • 9,613
  • 13
  • 72
  • 151
4
votes
1 answer

Why do allocations with numa_alloc_onnode() lead to "The page is not present"?

When I allocate memory on a specific NUMA node using numa_alloc_onnode() like this : char *ptr; if ((ptr = (char *) numa_alloc_onnode(1024,1)) == NULL) { fprintf(stderr,"Problem in %s line %d allocating memory\n",__FILE__,__LINE__); …
Rob_before_edits
  • 1,163
  • 9
  • 13
4
votes
1 answer

How to make program NUMA ready?

My program uses shared memory as a data storage. This data must be available to any running application, and fetching this data must be fast. But some applications can run on different NUMA nodes, and data access for them is really expensive. Is data…
Evgeny Lazin
  • 9,193
  • 6
  • 47
  • 83
4
votes
3 answers

realloc() for NUMA Systems using HWLOC

I have several custom allocators that provide different means to allocate memory based on different policies. One of them allocates memory on a defined NUMA node. The interface to the allocator is straightforward template class…
grundprinzip
  • 2,471
  • 1
  • 20
  • 34
4
votes
0 answers

How are data distributed across memory pages in ccNUMA systems?

I'm trying to understand shared-memory architectures, especially ccNUMA systems. I have read about first touch policy, but I am still a bit confused. I am trying to understand how data are distributed in memory pages. Let's say we have the example…
4
votes
3 answers

Benchmarking processor affinity impact

I'm working on a NUMA architecture, where each compute node has 2 sockets and 4 cores per socket, for a total of 8 cores per compute node, and 24GB of RAM per node. I have to prove that setting processor affinity can have a significant impact on…
Charles Brunet
  • 21,797
  • 24
  • 83
  • 124
4
votes
1 answer

Is mov + mfence safe on NUMA?

I see that g++ generates a simple mov for x.load() and mov+mfence for x.store(y). Consider this classic example: #include <atomic> #include <thread> std::atomic<bool> x, y; bool r1; bool r2; void go1(){ x.store(true); } void go2(){ …
qbolec
  • 5,374
  • 2
  • 35
  • 44
4
votes
2 answers

Is the memory block allocated by a thread on the same NUMA node as the thread itself until the thread exits?

This is a question about NUMA. For example, in the code below, is the buffer allocated at the local memory of the thread/process throughout its life? for (int th = 0; th < maxThreads; th++) { threads[th] = std::thread([&, th] { …
4
votes
1 answer

Determine NUMA layout via latency/performance measurements

Recently I have been observing performance effects in memory-intensive workloads I was unable to explain. Trying to get to the bottom of this I started running several microbenchmarks in order to determine common performance parameters like cache…
4
votes
3 answers

How to force two process to run on the same CPU?

Context: I'm programming a software system that consists of multiple processes. It is programmed in C++ under Linux, and the processes communicate using Linux shared memory. Usually, in software development, it is in the final stage when the…
kovan
  • 680
  • 1
  • 5
  • 15