
I have a virtual machine with a passthrough InfiniBand NIC. I am testing InfiniBand functionality using a hello world program. I am new to this, so I need help understanding the following errors.

I have installed Open MPI on Ubuntu using apt-get:

spatel@ib-1:~$ mpirun -V
mpirun (Open MPI) 4.0.3

InfiniBand NIC:

spatel@ib-1:~$ lspci -nn | grep -i mell
00:05.0 Infiniband controller [0207]: Mellanox Technologies MT28908 Family [ConnectX-6 Virtual Function] [15b3:101c]

My hello world program

spatel@ib-1:~$ mpirun -np 2 ./mpi_hello_world
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

  Local host:            ib-1
  Device name:           mlx5_0
  Device vendor ID:      0x02c9
  Device vendor part ID: 4124

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default.  The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host:              ib-1
  Local adapter:           mlx5_0
  Local port:              1

--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   ib-1
  Local device: mlx5_0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Open MPI failed an OFI Libfabric library call (fi_endpoint).  This is highly
unusual; your job may behave unpredictably (and/or abort) after this.

  Local host: ib-1
  Location: mtl_ofi_component.c:629
  Error: Unspecified error (256)
--------------------------------------------------------------------------
Hello world from processor ib-1, rank 0 out of 2 processors
Hello world from processor ib-1, rank 1 out of 2 processors
[ib-1:65704] 1 more process has sent help message help-mpi-btl-openib.txt / no device params found
[ib-1:65704] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[ib-1:65704] 1 more process has sent help message help-mpi-btl-openib.txt / ib port not selected
[ib-1:65704] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
[ib-1:65704] 1 more process has sent help message help-mtl-ofi.txt / OFI call fail

It throws a bunch of warnings and errors, so I am not sure what to make of them. Does it use the IB interface to run this job?

UPDATE

As suggested by @Gilles Gouaillardet in the comments, I compiled Open MPI with UCX, and now I see the following output when running the hello world program:

spatel@ib-1:~$ /home/spatel/ompi/bin/mpirun -np 2 ./hello_world_ucx --mca opal_common_ucx_opal_mem_hooks 1
--------------------------------------------------------------------------
PMIx was unable to find a usable compression library
on the system. We will therefore be unable to compress
large data streams. This may result in longer-than-normal
startup times and larger memory footprints. We will
continue, but strongly recommend installing zlib or
a comparable compression library for better user experience.

You can suppress this warning by adding "pcompress_base_silence_warning=1"
to your PMIx MCA default parameter file, or by adding
"PMIX_MCA_pcompress_base_silence_warning=1" to your environment.
--------------------------------------------------------------------------

Hello world from processor ib-1, rank 0 out of 2 processors
Hello world from processor ib-1, rank 1 out of 2 processors

To test my InfiniBand network, I created a second, similar VM, ib-2, with an InfiniBand NIC, to see hello_world use RDMA for communication.

/home/spatel/ompi/bin/mpirun --host ib-1,ib-2 -np 2 ./hello_world_ucx --mca opal_common_ucx_opal_mem_hooks 1

At the same time I ran tcpdump on the ibs5 interface, which is my InfiniBand NIC, but I see no activity, and I notice the MPI messages are still using the traditional NIC eth0 for communication. How do I make sure it uses only InfiniBand for MPI? (I don't have any IP address configured on the IB NIC.)

Satish
  • This card is not known by Open MPI 4.1.0 `btl/openib` component. As a workaround, try editing your `/.../share/openmpi/mca-btl-openib-device-params.ini` file. At section `[Mellanox ConnectX6]`, append `,4124` to the `vendor_part_id =` line. I cannot test it, so I hope that will work for you. – Gilles Gouaillardet Feb 05 '22 at 01:50
  • FWIW, `btl/openib` is a legacy component, and you should really use `UCX`. The logs indicate that Open MPI fails to use Infiniband via `btl/openib` and `mtl/ofi` (aka `libfabric`). Though that could work, this is not the best way ... So I recommend you install the latest UCX version and rebuild Open MPI. – Gilles Gouaillardet Feb 05 '22 at 01:52
  • As you suggested, I compiled Open MPI with UCX and updated my question. My MPI job uses the traditional NIC by default for message communication instead of the `IB` NIC. Do you know the reason? – Satish Feb 05 '22 at 05:16
  • That can happen if `ucx` is discarded. Try `mpirun --mca pml ucx ...`, which forces `pml/ucx` to be used and makes `mpirun` abort if it cannot be used. – Gilles Gouaillardet Feb 05 '22 at 05:50
  • I have tried `mpirun --host ib-1,ib-2 -np 2 ./hello_world_ucx --mca pml ucx` but it is still using the traditional interface to run the job. The IP addresses for the ib-1 and ib-2 host names are configured on the `eth0` interface; maybe that is why it uses that interface? I don't have any IP address on the Mellanox interface, which is `ibs5`. Do you think I should configure IP addresses on the `ib` interface and map them to the ib-1 and ib-2 hosts? – Satish Feb 05 '22 at 06:04
  • You used the wrong command line. Try `mpirun --mca pml ucx --host ib-1,ib-2 -np 2 ./hello_world_ucx` – Gilles Gouaillardet Feb 05 '22 at 06:30
  • FWIW, Open MPI does not use hostname to IP resolution in order to select the network to be used for communications. – Gilles Gouaillardet Feb 05 '22 at 06:30
  • `mpirun --mca pml ucx --host ib-1,ib-2 -np 2 ./hello_world_ucx` is still sending messages over the `eth0` interface. Is there any debug output or log that shows why it does not select `ib` and defaults to `eth0`? It feels like my MPI program doesn't know about IB. – Satish Feb 05 '22 at 13:58
  • Try `mpirun --mca pml ucx --mca pml_base_verbose 20 ...` and see what the log says. A possible explanation is that UCX does not know your IB card (not yet supported?) and uses the `tcp` provider. – Gilles Gouaillardet Feb 05 '22 at 14:06
  • Unless you are using IP over IB (and this is not what you should be doing) I do not expect anything to be reported by `tcpdump`. How did you conclude the traffic is using TCP instead of IB? (performance? other tool?) – Gilles Gouaillardet Feb 05 '22 at 14:08
  • I am running tcpdump on the eth0 and ibs5 interfaces, and when I run the job I see all messages going over eth0 while ibs5 does not see a single packet. You are saying nothing should be reported on the IB interface, so how do I confirm that my hello_world program is using InfiniBand? Is there any other method to confirm? – Satish Feb 05 '22 at 14:15
  • You can run a standard pingpong benchmark from the IMB or OSU test suite and look at the bandwidth for large messages. If it exceeds the theoretical performance of your TCP network, that is a hint InfiniBand is being used. – Gilles Gouaillardet Feb 06 '22 at 01:57
  • @GillesGouaillardet thanks for the pingpong hint. I am able to ping/pong between the two nodes' IB interfaces, and I also ran a bandwidth test that hit 100 Gbps, so connectivity looks good. Now how do I verify that my basic hello world, or any other MPI program, is using InfiniBand? I just want to make sure the application uses IB for communication. I am also trying to run `ibdump` to capture packets and check activity on the interface. – Satish Feb 07 '22 at 18:44
  • Hi, have you solved this? I have a similar problem. In my case I want to launch a job with Slurm, running in a VM without an IB interface, on other nodes that have IB network interfaces ( https://serverfault.com/q/1103439/970600 ) – SimoneM May 03 '23 at 11:14
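
The suggestions in the comments can be combined into one diagnostic run. The sketch below is an example under stated assumptions, not a confirmed fix. It builds an `mpirun` command that forces `pml/ucx` with verbose output, as suggested above, and additionally exports `UCX_NET_DEVICES` and `UCX_TLS`. These two UCX environment variables do not appear in the thread; they are assumed here to restrict UCX to `mlx5_0:1` over the `rc`, shared-memory, and loopback transports. With `tcp` excluded, the job should abort rather than quietly fall back to `eth0`. All `mpirun` options come before the executable, because anything after it is passed to the program instead of `mpirun`. The `read_counter` helper and the `RUN` switch are hypothetical names, and the sysfs counter path assumes the virtual function exposes port counters.

```shell
#!/bin/sh
# Sketch only: device name and port are taken from the logs in the question.
IB_DEV="${IB_DEV:-mlx5_0}"
IB_PORT="${IB_PORT:-1}"

# Print an mpirun command that allows UCX to use only the IB device.
# UCX_NET_DEVICES / UCX_TLS are UCX settings assumed here, not from the thread.
build_cmd() {
    printf '%s' "mpirun --mca pml ucx --mca pml_base_verbose 20"
    printf '%s' " -x UCX_NET_DEVICES=${IB_DEV}:${IB_PORT} -x UCX_TLS=rc,sm,self"
    printf '%s\n' " --host ib-1,ib-2 -np 2 ./hello_world_ucx"
}

# Print a sysfs port counter, or "n/a" when the file is missing
# (for example when the virtual function does not expose counters).
read_counter() {
    if [ -r "$1" ]; then cat "$1"; else echo "n/a"; fi
}

COUNTER="/sys/class/infiniband/${IB_DEV}/ports/${IB_PORT}/counters/port_xmit_data"

echo "xmit counter before: $(read_counter "$COUNTER")"
build_cmd
# Run the command only on request (RUN=1) and only if mpirun is installed.
if [ "${RUN:-0}" = "1" ] && command -v mpirun >/dev/null 2>&1; then
    $(build_cmd)
fi
echo "xmit counter after:  $(read_counter "$COUNTER")"
```

Running the script as-is only prints the command and the counter values; setting `RUN=1` executes the job. Some TCP traffic on `eth0` is expected in either case, since `mpirun` launches and wires up the remote processes over the regular network. A hello world program exchanges few or no MPI messages, so the counter is more informative when a pingpong benchmark is substituted for `./hello_world_ucx`.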

0 Answers