I am trying to run AMBER16 on a cluster, but the job fails when it is submitted through the scheduler with the "qsub" command. The same job runs correctly when launched directly on the front node. I have all of the paths set correctly in my .bashrc file. The following is my job script:

#!/bin/bash
#PBS -N testAmber
#PBS -l nodes=1:ppn=12
#PBS -l walltime=05:00:00

cd working_directory
export AMBERHOME=/state/partition1/apps/amber16

source $AMBERHOME/amber.sh

mpirun -np 12 $AMBERHOME/bin/sander.MPI -O -i ...etc...

When this is submitted, I get the following error messages:

.../.bashrc: line 46: /state/partition1/apps/amber16/amber.sh: No such file or directory
/var/spool/torque/mom_priv/jobs/...: line 16: /state/partition1/apps/amber16/amber.sh: No such file or directory

mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /state/partition1/apps/amber16/bin/sander.MPI
Node: compute-0-8.local

while attempting to start process rank 0.

I've been trying to find a solution for hours, but am stuck. Please help :(

Astronomer
  • does `/state/partition1/apps/amber16/amber.sh` exist on your compute node ? (e.g. is the filesystem mounted ?) – Gilles Gouaillardet Mar 09 '18 at 16:49
  • It should be mounted, because I can `cd` into it directly on the front node and open the file in vim (although it is source code). – Astronomer Mar 09 '18 at 17:07
  • You missed my point. When you submit a job, it typically runs on a compute node that is different from the front node, so the same file might not be available there (this could be a deliberate choice or an intermittent error). I suggest you add some debugging commands (e.g. `df -h; ls -l ...`) at the beginning of your script to double-check that. – Gilles Gouaillardet Mar 10 '18 at 00:23
  • @GillesGouaillardet I see. I messaged the server admin, and it seems that this was the case. It turns out there was another directory under /share/ for the same executable. Why did he/she have to do this, instead of using the link as is? – Astronomer Mar 11 '18 at 22:43
  • I cannot answer for your sysadmin. All I can say is that `PBS` will look for `/state/partition1/apps/amber16/amber.sh` on the first allocated compute node, and `mpirun` will look for `/state/partition1/apps/amber16/bin/sander.MPI` on **all** your compute nodes. – Gilles Gouaillardet Mar 12 '18 at 00:05
  • So it is up to the admin to make sure all filesystems that should be mounted are indeed mounted at the right place, and up to the end user to make sure the files can be found (e.g. if a path/file contains a symlink to a filesystem that is not mounted on the compute nodes, then this is an end user error). I recommend you have a chat with your sysadmin and explain the `MPI` requirements. The filesystem policy might be revisited, or you might have to adapt your `PBS` scripts. – Gilles Gouaillardet Mar 12 '18 at 00:06
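
The debugging step suggested in the comments could be sketched roughly as follows, placed at the top of the job script before `source $AMBERHOME/amber.sh`. This is only a minimal sketch: `check_path` is a hypothetical helper written for illustration (it is not part of AMBER or Torque), and the default `AMBERHOME` value is simply the path from the question.

```shell
#!/bin/bash
# Hypothetical helper (not part of AMBER or Torque): report whether a
# path is visible from the node on which this script is executing.
check_path() {
    local path="$1"
    if [ -e "$path" ]; then
        echo "OK: $path"
    else
        echo "MISSING: $path"
        return 1
    fi
}

# Assumption: fall back to the install path from the question if unset.
AMBERHOME="${AMBERHOME:-/state/partition1/apps/amber16}"

# Show which node the job actually landed on and what is mounted there.
echo "Running on node: $(uname -n)"
df -h

# Check the two files named in the error messages.
for f in "$AMBERHOME/amber.sh" "$AMBERHOME/bin/sander.MPI"; do
    check_path "$f" || true   # keep going so every path is reported
done
```

If either line comes back as `MISSING` in the job's output file, the filesystem (or a symlink target) is probably not available on the compute node, which matches what the sysadmin reported. Note that this only checks the node that runs the script; for a multi-node allocation the same check would have to be repeated on each host listed in `$PBS_NODEFILE`.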
