Is running with different cores different from running with different threads in OpenMP?

Question

To execute a section of code in parallel using a known number of threads, we usually do this:

#pragma omp parallel num_threads(8)
{}

However, how can we set the number of cores instead of the number of threads? Are these different?

score 2 · Accepted Answer · answered Mar 15 '21 at 00:49

TL;DR: you cannot directly specify a number of cores in OpenMP preprocessing directives, but you can control how OpenMP threads are mapped onto the available cores.

How it works:

Software threads can be dynamically created and destroyed at runtime by applications. They are mapped onto hardware resources, such as hardware threads and cores, which are fixed at runtime for a given platform. From user space, you cannot control cores directly, only threads.

In OpenMP you can control the number of threads at runtime using several approaches:

the num_threads clause in preprocessing directives
the OMP_NUM_THREADS environment variable
the omp_set_num_threads runtime function
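
As a minimal sketch of the first and third approaches, the two helpers below count how many threads actually join a parallel region. The names `team_size_with_clause` and `team_size_with_runtime_call` are made up for illustration, and the `#ifdef _OPENMP` guards are an assumption added so the code still builds (serially) when OpenMP is not enabled at compile time.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: request a team size with the num_threads clause
   and return the number of threads that actually ran the region. */
int team_size_with_clause(int requested)
{
    int count = 0;
    #pragma omp parallel num_threads(requested) reduction(+:count)
    {
        count += 1; /* each thread in the team contributes 1 */
    }
    return count;
}

/* Hypothetical helper: same measurement, but the team size is requested
   through the omp_set_num_threads runtime function. */
int team_size_with_runtime_call(int requested)
{
    int count = 0;
#ifdef _OPENMP
    omp_set_num_threads(requested);
#endif
    #pragma omp parallel reduction(+:count)
    {
        count += 1;
    }
    return count;
}
```

Calling either helper with 8 from main should return at most 8; the runtime may hand out fewer threads, for example when a thread limit applies. The second approach in the list needs no code change: launching the program with OMP_NUM_THREADS=8 sets the default team size for regions that have no num_threads clause.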

OpenMP abstracts the hardware hierarchy using places. It defines a place as "an unordered set of implementation-defined hardware unit of a device on which one or more OpenMP threads can execute". In practice, a place is usually a set of hardware threads on a CPU. Examples of valid places include a given hardware socket, three specific cores, or one specific hardware thread (multiple places can share the same hardware execution units). Places can be set manually using the OMP_PLACES environment variable.

The mapping/binding of OpenMP threads to places can be controlled using the OMP_PROC_BIND environment variable or, more recently, using the proc_bind clause on parallel directives. For example, you can force OpenMP threads to stay bound to places that are close to each other, or to be spread uniformly among the places.
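
Here is a sketch of the clause form, assuming an OpenMP 4.5 or later runtime for `omp_get_place_num` (the helper name `record_thread_places` is hypothetical). It asks for a `spread` binding on one parallel region and records the place each thread ends up on.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: place_of must hold at least `requested` ints.
   On return, place_of[i] is the place number thread i was bound to,
   or -1 if it was not bound (or OpenMP 4.5 is unavailable).
   Returns the number of threads that actually ran the region. */
int record_thread_places(int requested, int *place_of)
{
    int team = 1;
    for (int i = 0; i < requested; ++i)
        place_of[i] = -1;

#if defined(_OPENMP) && _OPENMP >= 201511
    /* proc_bind(spread) asks the runtime to spread the team over the places. */
    #pragma omp parallel num_threads(requested) proc_bind(spread)
    {
        place_of[omp_get_thread_num()] = omp_get_place_num();
        #pragma omp single
        team = omp_get_num_threads();
    }
#endif
    return team;
}
```

The clause only selects the binding policy for that region. The places themselves still come from OMP_PLACES or from the implementation's default place list.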

Example:

If you want to use 4 cores, you can use the following environment:

OMP_PLACES="cores(4)"
OMP_PROC_BIND=close

The OpenMP runtime will arbitrarily select 4 cores of your hardware and execute the threads on them. With up to 4 threads, the first thread runs on the first core, the second thread on the second core, and so on. If there are 8 threads, each of the 4 cores executes two OpenMP threads (even if your processor has 8 cores).
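
To check the mapping on a real machine, here is a hedged sketch that counts how many threads of a team land on each place. The helper `threads_per_place` and the cap `MAX_PLACES` are hypothetical and chosen arbitrarily, and OpenMP 4.5 or later is assumed for `omp_get_num_places` and `omp_get_place_num`.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_PLACES 256 /* arbitrary cap for this sketch */

/* Hypothetical helper: runs one parallel region and fills counts[p] with
   the number of threads bound to place p. Threads that are unbound (or
   bound to a place index >= MAX_PLACES) are added to *unbound.
   Returns the number of places in the place list (0 if unavailable). */
int threads_per_place(int requested, int counts[MAX_PLACES], int *unbound)
{
    int num_places = 0;
    int not_counted = 0;
    for (int i = 0; i < MAX_PLACES; ++i)
        counts[i] = 0;

#if defined(_OPENMP) && _OPENMP >= 201511
    num_places = omp_get_num_places();
    #pragma omp parallel num_threads(requested) reduction(+:not_counted)
    {
        int place = omp_get_place_num(); /* -1 when the thread is not bound */
        if (place >= 0 && place < MAX_PLACES) {
            #pragma omp atomic
            counts[place] += 1;
        } else {
            not_counted += 1;
        }
    }
#else
    (void)requested;
    not_counted = 1; /* serial fallback: one unbound thread */
#endif
    *unbound = not_counted;
    return num_places;
}
```

Called with 8 under the environment shown above, one would expect 4 places with two threads each, in line with the description above, though the exact result depends on the runtime and the machine.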

So, in order to know how many threads are executing on each core, I should first find the total number of threads on my machine and then divide it by the number of forced cores, right? — MA19, Mar 15 '21 at 05:46
I assume you are talking about *software* threads (which should not be confused with *hardware* threads). If so, for simple cases (this depends on your configuration/application), you can divide the total number of *OpenMP* threads of a *target application* by the number of "forced" cores. — Jérôme Richard, Mar 15 '21 at 20:17
