
Recently I installed Intel oneAPI, including the C compiler, the Fortran compiler and the MPI library, and compiled VASP with it.

Before presenting the question, I should mention two workarounds I needed while installing VASP:

  1. glibc 2.14: the cluster is an old machine with glibc 2.12, whereas oneAPI needs glibc 2.14. So I compiled glibc 2.14 and exported the library path: export LD_LIBRARY_PATH="~/mysoft/glibc214/lib:$LD_LIBRARY_PATH"
  2. ld 2.24: the ld version on the cluster is 2.20, while a newer version is needed. So I installed binutils 2.24.
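
One detail about the export line in item 1 may be worth checking: if it was entered exactly as written, the shell does not expand a tilde inside double quotes, so the variable would contain a literal "~" that the dynamic linker does not resolve. A minimal sketch of the difference, assuming the install prefix from the post (the variable names are only illustrative):

```shell
# Install prefix taken from the post; adjust it to your own layout.
GLIBC_PREFIX="$HOME/mysoft/glibc214"

# A tilde inside double quotes is NOT expanded; it stays a literal "~".
quoted="~/mysoft/glibc214/lib"

# $HOME is expanded inside double quotes, giving an absolute path.
expanded="$GLIBC_PREFIX/lib"

echo "quoted:   $quoted"
echo "expanded: $expanded"

# Prepend the absolute path; the ${VAR:+...} form avoids a trailing ":"
# when LD_LIBRARY_PATH was previously empty or unset.
export LD_LIBRARY_PATH="$expanded${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

If the first echo shows a leading "~", the custom glibc directory was probably never on the search path in that shell.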

The cluster has one master node connected to 30 compute nodes. I have run the calculation in three ways:

  1. When I run the calculation on the master, it works without problems.
  2. When I log in to a node manually with the rsh command, the calculation on that node also runs without problems.
  3. But usually I submit the calculation script from the master (with Slurm or PBS) so that the calculation runs on a node. In that case, I get the following error message:
[mpiexec@node3.alineos.net] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@node3.alineos.net] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@node3.alineos.net] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1062): error waiting for event
[mpiexec@node3.alineos.net] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1015): error setting up the bootstrap proxies
[mpiexec@node3.alineos.net] Possible reasons:
[mpiexec@node3.alineos.net] 1. Host is unavailable. Please check that all hosts are available.
[mpiexec@node3.alineos.net] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.
[mpiexec@node3.alineos.net] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.
[mpiexec@node3.alineos.net] 4. pbs bootstrap cannot launch processes on remote host. You may try using -bootstrap option to select alternative launcher.
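
Item 4 of the message suggests trying a different bootstrap launcher. A minimal sketch of how that could look in a job script, assuming rsh as the launcher (manual rsh logins to the nodes work, as described above). The process count and the ./vasp_std binary name are placeholders, not values from my setup, and RUN_VASP is a hypothetical switch so the sketch can be checked without launching anything:

```shell
#!/bin/bash
# Placeholders: adjust the launcher, process count and binary to your job.
BOOTSTRAP="rsh"
NPROCS=4
VASP_BIN="./vasp_std"

# Intel MPI also reads the launcher from this environment variable,
# which has the same effect as passing -bootstrap on the command line.
export I_MPI_HYDRA_BOOTSTRAP="$BOOTSTRAP"

# Build the launch command so it can be inspected before it is run.
launch_cmd="mpirun -bootstrap $BOOTSTRAP -n $NPROCS $VASP_BIN"
echo "$launch_cmd"

# Only launch when explicitly requested (RUN_VASP is a hypothetical switch).
if [ "${RUN_VASP:-0}" = "1" ]; then
    $launch_cmd
fi
```

If the job starts with an explicit launcher but fails with the default one, that would point at the scheduler integration rather than at the VASP build.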

I only get this error with code compiled with oneAPI, not with code compiled with Intel® Parallel Studio XE. Do you have any idea what causes this error? Your response will be highly appreciated.

Best,

Léon


1 Answer


Could it be that the Slurm agent does not have the correct permissions or library path?
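
One way to test this is to record the relevant environment once in an interactive rsh session on a node and once inside a submitted job, then compare the two. A rough sketch, assuming a bash job script; the dump_env function and the output file name are made up for illustration, while SLURM_JOB_ID is the variable Slurm sets inside a job:

```shell
#!/bin/bash
# Print the parts of the environment that matter for launching Intel MPI.
dump_env() {
    echo "host=$(uname -n)"
    echo "user=$(id -un)"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
    echo "PATH=$PATH"
    # Show which copy of each tool is found first, if any.
    for tool in mpirun hydra_bstrap_proxy ld; do
        echo "$tool=$(command -v "$tool" || echo '<not found>')"
    done
}

# Inside a Slurm job the file is tagged with the job ID,
# otherwise with "interactive".
outfile="env.$(uname -n).${SLURM_JOB_ID:-interactive}.txt"
dump_env > "$outfile"
echo "wrote $outfile"
```

Running diff on the interactive file and the job file from the same node should show whether the custom glibc and binutils paths, or hydra_bstrap_proxy itself, are missing in the batch environment.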

TonyM