I am trying to run a simple OpenMPI test on two servers:

 mpirun --report-bindings --host serv1.cell,serv2.cell  -np 2 hostname

Both servers run openSUSE 13.2 and have similar network interface configurations:

ens2f0 - internet connection, External firewall zone

ens2f1 - lan connection (192.168.0.0), Internal firewall zone

ens2f2 - bonding slave, Internal firewall zone

ens2f3 - bonding slave, Internal firewall zone

bond0 - bonding interface (192.168.6.0), a different subnet than ens2f1, Internal firewall zone

serv1.cell and serv2.cell are defined in /etc/hosts as addresses in the bonding network (192.168.6.0).

OpenMPI was installed from the default repos using zypper.

If both firewalls are off, everything is fine, but when one of them is running, strange things happen.

If I turn off the firewall on serv1 and leave it running on serv2, OpenMPI works when launched from serv1:

serv1.cell:~ # mpirun --report-bindings --host serv1.cell,serv2.cell  -np 2 hostname
serv2.cell
serv1.cell

And does not work on serv2:

serv2.cell:~ # mpirun --report-bindings --host serv1.cell,serv2.cell  -np 2 hostname

If I turn off the firewall on serv2 and leave it running on serv1, it goes the other way around: serv2 works fine, but serv1 hangs.

I also tried a simple test using netcat with both firewalls on: with netcat listening on serv1, the connection and data from serv2 are fine, and vice versa, so the firewalls appear to allow anything through bond0. Turning the firewalls off is not a solution, so how should I configure OpenMPI (or the firewall) to make both servers work properly?

  • I found that on the slave server the MPI orted daemon runs with the option -mca orte_hnp_uri 842792960.0;tcp://wan.address:50735;tcp://192.168.0.206:50735;tcp://192.168.6.206:50735. I think I need to override the defaults somehow. – Дмитрий Пузырьков Jan 29 '16 at 08:18
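
To make the comment above easier to read, here is a small sketch that splits the quoted orte_hnp_uri value into its separate endpoints. The variable names are only illustrative; the URI string is copied from the comment. It shows that the launching node advertises one tcp:// address per interface (external, LAN and bond0), which suggests the remote daemon may try an address that the firewall does not allow before it gets to bond0.

```shell
# URI value copied from the orted command line quoted in the comment above.
hnp_uri='842792960.0;tcp://wan.address:50735;tcp://192.168.0.206:50735;tcp://192.168.6.206:50735'

# Fields are separated by ';'. Keep only the tcp:// endpoints, one per line.
endpoints=$(printf '%s\n' "$hnp_uri" | tr ';' '\n' | grep '^tcp://')

printf '%s\n' "$endpoints"
```

Only the last endpoint (192.168.6.206) is on the bonding network that both hosts resolve through /etc/hosts.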

1 Answer

I finally found how to tell OpenMPI to use only specific interfaces. In /path/to/openmpi/etc/openmpi-mca-params.conf, list the interfaces and networks to use by adding:

btl_tcp_if_include = ifacename,0.0.0.0/24
oob_tcp_if_include = ifacename,0.0.0.0/24

which in my case is just

btl_tcp_if_include = bond0
oob_tcp_if_include = bond0

Now OpenMPI uses bond0 only.
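
If editing the system-wide file is not convenient, the same two parameters can also be set per session. The sketch below assumes the bond0 interface and the host names from the question, and relies on Open MPI's convention of reading MCA parameters from environment variables prefixed with OMPI_MCA_. Treat it as an untested alternative rather than the configuration actually used above.

```shell
# Restrict both the MPI traffic (btl) and the runtime's out-of-band
# channel (oob) to bond0, for the current shell session only.
export OMPI_MCA_btl_tcp_if_include=bond0
export OMPI_MCA_oob_tcp_if_include=bond0

# Run the original test only where mpirun is actually installed.
if command -v mpirun >/dev/null 2>&1; then
    mpirun --report-bindings --host serv1.cell,serv2.cell -np 2 hostname
fi
```

The parameters can likewise be passed directly on the command line, for example mpirun --mca btl_tcp_if_include bond0 --mca oob_tcp_if_include bond0, followed by the usual arguments.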

chicks