
This is a follow-up to an earlier question of mine that (seemingly) couldn't be resolved. I'm developing a Python script to run on a cluster. It needs to call external scripts/programs and wait for them to finish before continuing (otherwise the data it needs won't be there).

To run a script on the cluster, it has to be submitted via a shell script. This is what causes the problem: I execute the shell script from Python, but the shell runs the script/program and closes itself before that script/program finishes. Python then thinks the work is done and carries on.

I've spent longer than I care to admit trying to solve this via Google and this website, and I'm drawing a blank: it seems like it SHOULD wait, and it just doesn't.

I've enclosed a sample shell script and would be willing to try almost anything.

EDIT FOR CLARIFICATION:

This script is an example .sh file that is used to launch a script/program. In this example it launches a Python script called "Test.py". The .sh file is submitted to the cluster through the queue system with the command "qsub ****.sh", where **** is the name of the shell script. This is done from the terminal or, in my case, from another Python script (the one that needs to know when this other process is done).

This shell script does not wait for Test.py to finish before exiting. This is what causes me all the problems.

#!/bin/bash
#
#PBS -l walltime=1:59:59,nodes=1:ppn=1
#PBS -j oe
#PBS -m ae
#
#-------------------------------------------------
#
cd ${HOME}/ERA-Primary-Folder
#
python ./Test.py

Sorry if doing a new question following on from my own is against the rules here. It just seems like a fresh issue as my original question looked at the Python side of it, whilst this focuses on the shell script.

Any questions please just ask. Thanks in advance to anyone who tries to help me!

Steve
  • What is the script you wrote here showing us? This is an example script that gets run on the cluster that you need to wait for? Does this script exit immediately or only exit once the python script has finished? – Etan Reisner Mar 09 '15 at 18:26
  • Sorry for any confusion, I'll add the below clarification to the main post: This script is an example .sh file that is used to launch a script/program. In this example it is launching a python script called "Test.py". This .sh file would be submitted to the cluster via the queue system using the command "qsub ****.sh" where **** is the shell name. This is done via the terminal, or in my case, via another Python script (the one that needs to know when this other process is done). This shell script does not wait for Test.py to finish before exiting. This is what causes me all the problems. – Steve Mar 09 '15 at 18:43
  • That shell script almost certainly does wait. `qsub`, on the other hand, quite likely does not. You likely need to poll for the job to finish (unless you can explicitly wait for it with some other q* command). – Etan Reisner Mar 09 '15 at 18:57
  • There's the possibility I might be able to somehow get Python to collect the job number from the q system, if so then I might be able to get it to check if that job is still running. No idea how I'd do that though! :P Sorry for my ignorance, but what is polling for the job to finish? – Steve Mar 09 '15 at 19:01
  • What you just said. Continually (with some delay) asking the system if the job is finished. Better would be a `qwait` or whatever command that you can run which will not return until the job is finished. – Etan Reisner Mar 09 '15 at 19:04
  • Ok brilliant, I'll look into if this is feasible on the cluster! Thanks a million for the help! – Steve Mar 09 '15 at 19:10
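
The polling idea from the comments could be sketched roughly as below. This is only a minimal sketch under several assumptions, not a tested solution for this cluster: it assumes Python 3.5+ (for `subprocess.run`), that `qsub` prints the job ID on standard output, and that `qstat -f <job_id>` either exits with a non-zero status or reports `job_state = C` once the job has finished (the behaviour on Torque-style PBS; other schedulers may differ). The function names `submit_job`, `job_is_running` and `wait_for_job` are made up for illustration.

```python
import subprocess
import time


def submit_job(script_path):
    """Submit a shell script with qsub and return the job ID it prints.

    Assumption: qsub writes the job ID (and nothing else) to stdout.
    """
    result = subprocess.run(
        ["qsub", script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


def job_is_running(job_id):
    """Return True while the queue system still lists the job as unfinished."""
    result = subprocess.run(
        ["qstat", "-f", job_id],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Assumption: qstat fails once the job is no longer known to the queue.
        return False
    for line in result.stdout.splitlines():
        if line.strip().startswith("job_state"):
            state = line.split("=", 1)[1].strip()
            # Assumption: "C" marks a completed job (Torque-style output).
            return state != "C"
    return True


def wait_for_job(job_id, is_running=job_is_running, poll_interval=30.0,
                 timeout=None, sleep=time.sleep):
    """Poll until the job finishes.

    Returns True when the job is done, or False if `timeout` seconds of
    polling elapse first. `is_running` and `sleep` can be swapped out,
    which makes the loop easy to try without a real queue.
    """
    waited = 0.0
    while is_running(job_id):
        if timeout is not None and waited >= timeout:
            return False
        sleep(poll_interval)
        waited += poll_interval
    return True
```

The calling script would then do something like `job_id = submit_job("run_test.sh")` followed by `wait_for_job(job_id)` before reading the output data (`run_test.sh` is a placeholder name). Some schedulers also offer a blocking submit, for example `qsub -W block=true` on PBS Professional, which would avoid polling altogether; whether that is available here depends on the cluster, so check the local `qsub` man page.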
