
How can I pass the Intel MPI flag `-print-rank-map` as input to the `srun` command, or as an environment variable in a batch script submitted to a SLURM system via the `sbatch` command?
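
For reference, a minimal sketch of the flag form being asked about, assuming Intel MPI's own launcher is used (the application name and rank count are placeholders):

mpirun -print-rank-map -np 36 ./my_app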

intergallactic
    According to the Intel MPI docs setting `export I_MPI_DEBUG=4` should print the process pinning information and node mapping table: https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/other-environment-variables.html – AndyT Oct 05 '22 at 14:10
  • Hi @AndyT, Yes it does, but at the node level. No info is provided on the socket level i.e. which processes run on which socket. I think `-print-rank-map` does not provide it either, contrary to `--report-bindings` of the OpenMPI implementation. – intergallactic Oct 07 '22 at 09:56

1 Answer


Using `export I_MPI_DEBUG=4`, along with knowledge of which core IDs belong to which sockets, allows you to get this information. For example, I can get the mapping between sockets and core IDs from `lscpu`:

[auser@login3 ~]$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              72
On-line CPU(s) list: 0-71
Thread(s) per core:  2
Core(s) per socket:  18
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
Stepping:            1
CPU MHz:             3091.353
CPU max MHz:         3300.0000
CPU min MHz:         1200.0000
BogoMIPS:            4199.86
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            46080K
NUMA node0 CPU(s):   0-17,36-53
NUMA node1 CPU(s):   18-35,54-71
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts

As this is an Intel Broadwell CPU, the NUMA regions correspond to sockets:

NUMA node0 CPU(s):   0-17,36-53
NUMA node1 CPU(s):   18-35,54-71
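
If you prefer a machine-readable view of the same mapping, `lscpu` can also emit a parsable table (a sketch, assuming a util-linux version that supports the `-p` option; one logical CPU per line):

# Print logical CPU ID, socket and NUMA node, skipping the comment header
lscpu -p=CPU,SOCKET,NODE | grep -v '^#'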

Setting `export I_MPI_DEBUG=4` gives the following type of information, from which I can work out that ranks 0-17 are bound to socket 0 and ranks 18-35 are bound to socket 1.

[0] MPI startup(): Intel(R) MPI Library, Version 2019 Update 9  Build 20200923 (id: abd58e492)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.10.1-impi
[0] MPI startup(): libfabric provider: verbs;ofi_rxm
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       2685376  r1i7n14    0
[0] MPI startup(): 1       2685377  r1i7n14    1
[0] MPI startup(): 2       2685378  r1i7n14    2
[0] MPI startup(): 3       2685379  r1i7n14    3
[0] MPI startup(): 4       2685380  r1i7n14    4
[0] MPI startup(): 5       2685381  r1i7n14    5
[0] MPI startup(): 6       2685382  r1i7n14    6
[0] MPI startup(): 7       2685383  r1i7n14    7
[0] MPI startup(): 8       2685384  r1i7n14    8
[0] MPI startup(): 9       2685385  r1i7n14    9
[0] MPI startup(): 10      2685386  r1i7n14    10
[0] MPI startup(): 11      2685387  r1i7n14    11
[0] MPI startup(): 12      2685388  r1i7n14    12
[0] MPI startup(): 13      2685389  r1i7n14    13
[0] MPI startup(): 14      2685390  r1i7n14    14
[0] MPI startup(): 15      2685391  r1i7n14    15
[0] MPI startup(): 16      2685392  r1i7n14    16
[0] MPI startup(): 17      2685393  r1i7n14    17
[0] MPI startup(): 18      2685394  r1i7n14    18
[0] MPI startup(): 19      2685395  r1i7n14    19
[0] MPI startup(): 20      2685396  r1i7n14    20
[0] MPI startup(): 21      2685397  r1i7n14    21
[0] MPI startup(): 22      2685398  r1i7n14    22
[0] MPI startup(): 23      2685399  r1i7n14    23
[0] MPI startup(): 24      2685400  r1i7n14    24
[0] MPI startup(): 25      2685401  r1i7n14    25
[0] MPI startup(): 26      2685402  r1i7n14    26
[0] MPI startup(): 27      2685403  r1i7n14    27
[0] MPI startup(): 28      2685404  r1i7n14    28
[0] MPI startup(): 29      2685405  r1i7n14    29
[0] MPI startup(): 30      2685406  r1i7n14    30
[0] MPI startup(): 31      2685407  r1i7n14    31
[0] MPI startup(): 32      2685408  r1i7n14    32
[0] MPI startup(): 33      2685409  r1i7n14    33
[0] MPI startup(): 34      2685410  r1i7n14    34
[0] MPI startup(): 35      2685411  r1i7n14    35
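
If you would rather not read the socket off the `lscpu` ranges by eye, one possible cross-check (a sketch, assuming standard Linux sysfs paths) is the per-CPU topology files, where physical_package_id is the socket ID:

# Map a few of the "Pin cpu" values from the table above to their socket;
# the CPU IDs listed here are just examples.
for cpu in 0 17 18 35; do
    echo "cpu $cpu -> socket $(cat /sys/devices/system/cpu/cpu$cpu/topology/physical_package_id)"
done
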
AndyT
  • Thanks @AndyT, indeed this is true. How did you invoke the MPI run: did you use the `mpirun` command itself, `mpirun.hydra`, or `srun` (SLURM)? For example, with `mpirun` I do see the core pinning, but with `srun` the core number reported for every MPI rank is "+1", and I'm not sure what that corresponds to. – intergallactic Oct 10 '22 at 11:21
  • I used `srun` from within a batch script submitted using the `sbatch` command. This sounds like a local configuration issue, so you probably need to speak to your local system administrators/support team to go further. – AndyT Oct 11 '22 at 13:38
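
A minimal sketch of the workflow described in the comment above (node/task counts, partition and application name are placeholders; the exact SBATCH options depend on your site):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=36
#SBATCH --partition=standard    # placeholder partition name

# Ask Intel MPI to print the pinning / node mapping table at startup
export I_MPI_DEBUG=4

srun ./my_app                   # placeholder application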