I just installed Linux and Intel MPI to two machines:
(1) Quite old (~8 years old) SuperMicro server, which has 24 cores (Intel Xeon X7542 X 4). 32 GB memory. OS: CentOS 7.5
(2) New HP ProLiant DL380 server, which has 32 cores (Intel Xeon Gold 6130 X 2). 64 GB memory. OS: OpenSUSE Leap 15
After installing OS and Intel MPI, I compiled intel MPI benchmark and ran it:
$ mpirun -np 4 ./IMB-EXT
It is quite surprising that I find the same error when running IMB-EXT and IMB-RMA, though I have a different OS and everything (even GCC version used to compile Intel MPI benchmark is different -- in CentOS, I used GCC 6.5.0, and in OpenSUSE, I used GCC 7.3.1).
On the CentOS machine, I get:
#---------------------------------------------------
# Benchmarking Unidir_Put
# #processes = 2
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#
# MODE: AGGREGATE
#
#bytes #repetitions t[usec] Mbytes/sec
0 1000 0.05 0.00
4 1000 30.56 0.13
8 1000 31.53 0.25
16 1000 30.99 0.52
32 1000 30.93 1.03
64 1000 30.30 2.11
128 1000 30.31 4.22
and on the OpenSUSE machine, I get
#---------------------------------------------------
# Benchmarking Unidir_Put
# #processes = 2
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#
# MODE: AGGREGATE
#
#bytes #repetitions t[usec] Mbytes/sec
0 1000 0.04 0.00
4 1000 14.40 0.28
8 1000 14.04 0.57
16 1000 14.10 1.13
32 1000 13.96 2.29
64 1000 13.98 4.58
128 1000 14.08 9.09
When I don't use mpirun (which means there is only one process to run IMB-EXT), the benchmark runs through, but Unidir_Put needs >=2 processes, so doesn't help so much, and I also find that the functions with MPI_Put and MPI_Get is extremely slower than I expected (from my experience). Also, using MVAPICH on the OpenSUSE machine did not help. The output is:
#---------------------------------------------------
# Benchmarking Unidir_Put
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#
# MODE: AGGREGATE
#
#bytes #repetitions t[usec] Mbytes/sec
0 1000 0.03 0.00
4 1000 17.37 0.23
8 1000 17.08 0.47
16 1000 17.23 0.93
32 1000 17.56 1.82
64 1000 17.06 3.75
128 1000 17.20 7.44
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 49213 RUNNING AT iron-0-1
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
update: I tested OpenMPI, and it goes through smoothly (although my application does not recommend using openmpi, and I still don't understand why Intel MPI or MVAPICH doesn't work...)
#---------------------------------------------------
# Benchmarking Unidir_Put
# #processes = 2
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#
# MODE: AGGREGATE
#
#bytes #repetitions t[usec] Mbytes/sec
0 1000 0.06 0.00
4 1000 0.23 17.44
8 1000 0.22 35.82
16 1000 0.22 72.36
32 1000 0.22 144.98
64 1000 0.22 285.76
128 1000 0.30 430.29
256 1000 0.39 650.78
512 1000 0.51 1008.31
1024 1000 0.84 1214.42
2048 1000 1.86 1100.29
4096 1000 7.31 560.59
8192 1000 15.24 537.67
16384 1000 15.39 1064.82
32768 1000 15.70 2086.51
65536 640 12.31 5324.63
131072 320 10.24 12795.03
262144 160 12.49 20993.49
524288 80 30.21 17356.93
1048576 40 81.20 12913.67
2097152 20 199.20 10527.72
4194304 10 394.02 10644.77
Is there any chance that I am missing something in installing MPI, or installing OS in these servers? Actually, I assume that OS is the problem, but not sure where to start...
Thanks a lot in advance,
Jae