
Open MPI: 4.0.1a

HostFile:

  • 34bb0519eAAA
  • a2935f150BBB

I am on machine 34bb0519eAAA. From there, `ssh a2935f150BBB` connects to a2935f150BBB successfully, and from a2935f150BBB, `ssh 34bb0519eAAA` connects back to 34bb0519eAAA successfully.

But when I run the mpiexec command, I get this error message:

Warning: Permanently added '[XX.XX.XX.XX]:XX' (a2935f150BBB's IP address) to the list of known hosts.
--------------------------------------------------------------
A process or daemon was unable to complete a TCP connection
to another process:
  Local host:    a2935f150BBB
  Remote host:   34bb0519eAAA
This is usually caused by a firewall on the remote host. Please
check that any firewall (e.g., iptables) has been disabled and

ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

*  compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).

I am very confused by this, because SSH between the two machines works in both directions. How can mpiexec fail?

Here is the SSH session when I run `ssh a2935f150BBB`:

Warning: Permanently added '[XX.XX.XX.XX]:XX' to the list of known hosts.
Welcome to Ubuntu 18.04.1 LTS (XXXXXXXXXXXXXXXXXX)

This system has been minimized by removing packages and content that are not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.
Last login:XXXXXXXXXXXXX from XXXXXXXXXX

NoDirection

  • Open MPI requires more than just SSH in order to work, hence the help message about firewalls. The first warning about SSH is suspicious, since you claim you were able to SSH to the other node before invoking `mpiexec` (though that should not prevent Open MPI from working) – Gilles Gouaillardet Jan 24 '19 at 02:37
  • I checked the firewall: `sudo ufw status` reports `Status: inactive` – NoDirection Jan 24 '19 at 02:46
  • try `sudo iptables -L` on **both** nodes – Gilles Gouaillardet Jan 24 '19 at 02:49
  • Hi @GillesGouaillardet, I have the same problem... I am not sure how to use `iptables`. When I tried `sudo iptables -t nat -L`, I got `Chain PREROUTING (policy ACCEPT) target prot opt source destination Chain INPUT (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination Chain POSTROUTING (policy ACCEPT) target prot opt source destination `. How do I disable the firewall? Should I use `sudo iptables -F`? – Joxixi Mar 28 '21 at 13:33
  • At first glance, there is no firewall. – Gilles Gouaillardet Mar 28 '21 at 18:16
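
The point made in the comments, that a working SSH login only proves the SSH port is reachable, can be checked independently of MPI. Below is a minimal Python sketch for testing whether an arbitrary TCP port on one node accepts connections from the other. It is not part of Open MPI: the helper names `open_listener` and `can_connect` and the port 50000 in the usage notes are hypothetical, chosen only for illustration.

```python
import socket


def open_listener(bind_addr: str = "0.0.0.0", port: int = 0) -> socket.socket:
    """Open a TCP listening socket; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((bind_addr, port))
    srv.listen(1)
    return srv


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical usage, one step per node:
#   on 34bb0519eAAA:  srv = open_listener("0.0.0.0", 50000)
#   on a2935f150BBB:  print(can_connect("34bb0519eAAA", 50000))
```

The error message reports that a2935f150BBB could not connect to 34bb0519eAAA, so the listener belongs on 34bb0519eAAA and the connection attempt on a2935f150BBB. A `False` result would suggest that non-SSH ports are filtered or that the hostname resolves to an unreachable interface. A `True` result for one port does not prove every port is open, since Open MPI by default picks its TCP ports dynamically.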
