Is it possible to use all available threads for OpenMP in a region where only the master MPI process is active? I mean something like this:

         START 
           |
 MPI-I --------- MPI-II
(master)
   |               |
omp1-omp2       -skip-         REGION1 
omp3-omp4       -skip-
   |               |
omp1-omp2      omp3-omp4       REGION2
   |               |
   -----------------
          | 
         END 

Here the total number of threads available*** is 4: 2 are used for MPI, all 4 are used by MPI-I (the master) in REGION1, and 2 are used by each MPI process in REGION2.

It looks like this only works on Windows*, not on Linux**. On Linux, omp_get_num_procs() seems to detect that threads are being used by other MPI processes and returns a lower number than it does on Windows, which reports all available*** threads regardless of whether they are currently occupied by other active MPI processes.

On Linux, even explicitly adding the clause !$OMP &num_threads(Max_OMP_usage) to the OMP DO construct, with Max_OMP_usage equal to the total number of threads available***, has no effect.

*Windows: Intel ifort, MSMPI

**Linux: Intel ifort, Intel MPI (oneAPI 2021).

***By total number of threads available I mean the count that "lscpu" reports, for example, that is, the threads physically present (and not the number omp_get_num_procs() may return).

Costagol
  • The MPI implementation typically assigns non-overlapping core sets to each MPI task. So if at some point in time a task requires more, you would have to direct your MPI implementation **not** to do any process binding. – Gilles Gouaillardet Jul 19 '22 at 10:53
  • Yes, using I_MPI_PIN=0 it works! Thanks a lot! – Costagol Jul 19 '22 at 12:31
  • You do realize that you're using the fact that your MPI processes are on the same processor chip or at least the same node? That is not necessarily the case: MPI was designed for workstation networks and clusters where the MPI processes could only communicate through a network cable. – Victor Eijkhout Jul 20 '22 at 14:38
  • Yes, only the master MPI process executes Region1 in OMP parallel. Any other MPI process, regardless of the node where it is located (the same node or chip as the master, or not), simply skips it. Yes, I do realize that those OMP threads (OMP lives inside a single node) are occupied by MPI processes, but at that point they are not doing any work, as they skip that part and meet the master MPI process only at Region2. – Costagol Jul 21 '22 at 18:49

1 Answer

@Gilles Gouaillardet & @Costagol

Yes, you are correct. Setting I_MPI_PIN=0 disables process pinning, so the master process is no longer confined to its own core set and can use all the OMP threads.
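To see why pinning lowers the thread count a process can use, the effect can be reproduced without MPI at all. The following is a minimal Linux-only sketch in Python (not Fortran, so that it stays self-contained); the helper names `visible_cpus`, `pin_to_one_cpu` and `unpin` are made up for illustration. It assumes that the OpenMP runtime derives omp_get_num_procs() from the process affinity mask, which appears to be what happens in the Linux case described in the question: the machine-wide CPU count stays the same, while the set of CPUs the process is allowed to run on shrinks once it is pinned.

```python
import os


def visible_cpus():
    """Return (logical CPUs on the machine, CPUs this process may run on)."""
    return os.cpu_count(), len(os.sched_getaffinity(0))


def pin_to_one_cpu():
    """Mimic MPI process pinning: restrict this process to a single CPU.

    Returns the original affinity set so the caller can restore it.
    """
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {min(original)})
    return original


def unpin(original):
    """Restore the affinity set saved by pin_to_one_cpu()."""
    os.sched_setaffinity(0, original)


if __name__ == "__main__":
    print("before pinning (total, usable):", visible_cpus())
    saved = pin_to_one_cpu()
    print("while pinned   (total, usable):", visible_cpus())
    unpin(saved)
    print("after unpinning (total, usable):", visible_cpus())
```

While pinned, the usable count drops to 1 even though the total is unchanged. Under that assumption, a pinned MPI rank is in the same position: requesting more threads with num_threads only oversubscribes the cores it was given, which would explain why the clause appeared to have no effect.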

The links below cover interoperability with OpenMP and the environment variables for process pinning:

https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/process-pinning/interoperability-with-openmp-api.html

https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/process-pinning/environment-variables-for-process-pinning.html