
I am trying to run 16 instances of an mpi4py Python script, hello.py. I stored 16 commands of this sort in s.txt:

python /lustre/4_mpi4py/hello.py > 01.out
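For reference, a file of 16 such lines (each with its own numbered output file) could be generated with a loop like this (a sketch based on the single example line above; the path is taken from the question):

```shell
# Generate 16 command lines, one per instance, each redirecting to its own NN.out
# (seq -w zero-pads the counter to 01..16):
for i in $(seq -w 1 16); do
  echo "python /lustre/4_mpi4py/hello.py > ${i}.out"
done > s.txt
```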

I am submitting this in Cray cluster via aprun command like this:

aprun -n 32 sh -c 'parallel -j 8 :::: s.txt'

My intention was to run 8 of those Python jobs per node at a time. The script ran for more than 3 hours and none of the *.out files were created. The PBS scheduler output file shows this:

Python version 2.7.3 loaded
aprun: Apid 11432669: Caught signal Terminated, sending to application
aprun: Apid 11432669: Caught signal Terminated, sending to application
parallel: SIGTERM received. No new jobs will be started.
parallel: SIGTERM received. No new jobs will be started.
parallel: Waiting for these 8 jobs to finish. Send SIGTERM again to stop now.
parallel: Waiting for these 8 jobs to finish. Send SIGTERM again to stop now.
parallel: SIGTERM received. No new jobs will be started.
parallel: SIGTERM received. No new jobs will be started.
parallel: Waiting for these 8 jobs to finish. Send SIGTERM again to stop now.
parallel: Waiting for these 8 jobs to finish. Send SIGTERM again to stop now.
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 07.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 03.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 09.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 07.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 02.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 04.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 06.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 09.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 09.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 01.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 01.out
parallel: SIGTERM received. No new jobs will be started.
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 10.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 03.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 04.out
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 08.out
parallel: SIGTERM received. No new jobs will be started.
parallel: python /lustre/beagle2/ams/testing/hpc_python_2015/examples/4_mpi4py/hello.py > 03.out

I am running this on one node, which has 32 cores. I suspect my use of the GNU parallel command is wrong. Can someone please help with this?

user2458189
  • What is your Cray? Is it some kind of Linux, and which one (including version)? Does your script work without the GNU `parallel` command? Why do you want to use the `parallel` command (what is the task? MPI is usually good at starting parallel jobs). – osgx Apr 20 '17 at 00:38
  • It's a supercomputer. The goal is to run all 16 instances of the Python script on only 1 node, but because, say, the node has 32GB, you can't run all 16 jobs at the same time (so I just run, say, 8 at a time), or say your application is not threaded. In any case I have to use GNU parallel. But I am new to that syntax, and I assume my error is there. – user2458189 Apr 20 '17 at 00:50
  • The compute node where I am running it has a Unix-like environment. It's a Cray XE6. And my Python script works; I tested it multiple times. – user2458189 Apr 20 '17 at 01:03
  • You have Cray's `aprun` to manage the resources of your target machine, not GNU `parallel`. aprun can start several tasks for you: http://www.nersc.gov/users/computational-systems/retired-systems/hopper/running-jobs/aprun/aprun-man-page/. Think for a moment about this: your Python application is not a single-process app; it is a multi-process MPI program: https://portal.tacc.utexas.edu/documents/13601/1102030/4_mpi4py.pdf#page=8. And there is `aprun` to arrange some set of CPUs, start several MPI processes, and give them **essential information** on how to find each other. But GNU parallel knows nothing of this. – osgx Apr 20 '17 at 04:16
  • Why do you think that you must use GNU `parallel`? Don't do `aprun ... parallel ... mpi_application`, as aprun does the forking and gives information to mpi_application; it doesn't expect something else to fork extra copies (extra copies make your MPI lib/application fail with SIGTERM/SIGSEGV). Only do `aprun ... mpi_application`, or `parallel ... aprun ... mpi_application`, or `parallel ... aprun ... NON_MPI_application`, or `parallel .. `[`qsub`](http://www.hector.ac.uk/coe/cray-xe6-workshop-2013-June/pdf/03-Compiling%20and%20Launching.pdf), or try `aprun ... parallel ... NON_MPI_application`. – osgx Apr 20 '17 at 04:56
  • Thank you, that makes sense. Can you please tell me: if I have multiple NON_MPI Python commands inside s.txt, would this be the proper syntax with aprun? aprun -n 32 sh -c 'parallel -j 8 :::: s.txt' – user2458189 Apr 20 '17 at 18:43

1 Answer


As listed in https://portal.tacc.utexas.edu/documents/13601/1102030/4_mpi4py.pdf#page=8:

from mpi4py import MPI

comm = MPI.COMM_WORLD

print "Hello! I'm rank %02d from %02d" % (comm.rank, comm.size)

print "Hello! I'm rank %02d from %02d" % (comm.Get_rank(),
                                          comm.Get_size())

print "Hello! I'm rank %02d from %02d" % \
    (MPI.COMM_WORLD.Get_rank(), MPI.COMM_WORLD.Get_size())

your 4_mpi4py/hello.py program is not a typical single-process program (or a single Python script), but a multi-process MPI application.

GNU parallel expects simpler programs and doesn't support interaction with MPI processes.

In your cluster there are many nodes, and every node may start a different number of MPI processes (with 2 eight-core CPUs per node, think about the variants: 2 MPI processes of 8 OpenMP threads each; 1 MPI process of 16 threads; 16 MPI processes without threads). To describe the slice of the cluster given to your task, there is an interface between the cluster management software and the MPI library used by the Python MPI wrapper your script imports. The management side is aprun (and qsub?):

http://www.nersc.gov/users/computational-systems/retired-systems/hopper/running-jobs/aprun/aprun-man-page/

https://www.nersc.gov/users/computational-systems/retired-systems/hopper/running-jobs/aprun/

You must use the aprun command to launch jobs on the Hopper compute nodes. Use it for serial, MPI, OpenMP, UPC, and hybrid MPI/OpenMP or hybrid MPI/CAF jobs.

https://wickie.hlrs.de/platforms/index.php/CRAY_XE6_Using_the_Batch_System

The job launcher for the XE6 parallel jobs (both MPI and OpenMP) is aprun. ... The aprun example above will start the parallel executable "my_mpi_executable" with the arguments "arg1" and "arg2". The job will be started using 64 MPI processes with 32 processes placed on each of your allocated nodes (remember that a node consists of 32 cores in the XE6 system). You need to have nodes allocated by the batch system before (qsub).

There is some interface between aprun, qsub, and MPI: in a normal start (aprun -n 32 python /lustre/4_mpi4py/hello.py), aprun just starts several (32) processes of your MPI program, sets the id of each process in that interface, and gives them the group id (for example, with environment variables like PMI_ID; the actual variables are specific to the launcher/MPI library combination).
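A quick way to see what a launcher hands to each process is to print such variables. A minimal sketch (the variable names below are common examples only; they differ between launchers and MPI libraries, so treat them as assumptions, not a fixed list):

```python
import os

# Rank/size variables set by various launchers for each started process.
# Names are launcher-specific examples (Cray PMI, ALPS, Open MPI):
for var in ("PMI_RANK", "PMI_SIZE", "ALPS_APP_PE", "OMPI_COMM_WORLD_RANK"):
    print("%s=%s" % (var, os.environ.get(var, "<unset>")))
```

Under aprun each of the 32 processes would see a distinct rank value here; under `parallel` inside a single aprun slot, the extra forked copies inherit the same values, which is exactly the clash described above.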

GNU parallel has no interface to MPI programs; it knows nothing about such variables. It will just start 8 times more processes than expected. All 32 * 8 processes in your incorrect command will have the same group id, and there will be 8 processes with the same MPI process id. They will make your MPI library misbehave.

Never mix MPI resource managers/launchers with ancient pre-MPI Unix process forkers like xargs or parallel, or with "very advanced bash scripting for parallelism". There is MPI for doing something in parallel; and there is the MPI launcher/job manager (aprun, mpirun, mpiexec) for starting several processes, forking, and ssh-ing to machines.

Don't do aprun -n 32 sh -c 'parallel anything_with_MPI' - this is an unsupported combination. The only possible (allowed) argument to aprun is a program using some supported kind of parallelism, like OpenMP, MPI, MPI+OpenMP, or a non-parallel program (or a single script that starts ONE parallel program).

If you have several independent MPI tasks to start, pass several argument sets to aprun, separated by colons: aprun -n 8 ./program_to_process_file1 : -n 8 ./program_to_process_file2 : -n 8 ./program_to_process_file3 : -n 8 ./program_to_process_file4
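Inside a PBS job script, that colon-separated MPMD form of aprun could look like the following sketch (the resource request line and program names are hypothetical placeholders; the width must match the total process count):

```shell
#PBS -l mppwidth=32
cd $PBS_O_WORKDIR
# Four independent 8-process MPI runs in one aprun, separated by ':' (MPMD syntax):
aprun -n 8 ./program_to_process_file1 : \
      -n 8 ./program_to_process_file2 : \
      -n 8 ./program_to_process_file3 : \
      -n 8 ./program_to_process_file4
```

Note that all parts of an MPMD launch share one job allocation and start together, so this fits tasks of similar duration.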

If you have multiple files to work on and want to start many parallel jobs, don't use a single qsub; use several, and allow PBS (or whichever job manager is used) to manage your jobs.

If you have a very high number of files, try not to use MPI in your program (don't link MPI libs / include MPI headers at all) and use parallel or another form of ancient parallelism, hidden from aprun. Or use a single MPI program and implement the file distribution directly in your code (the master MPI process may open the file list, then distribute files between the other MPI processes - with or without MPI's dynamic process management / mpi4py: http://pythonhosted.org/mpi4py/usrman/tutorial.html#dynamic-process-management).
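That master-driven distribution reduces to splitting the file list across ranks. A minimal sketch of just the partitioning logic in plain Python (no MPI needed to illustrate it; with mpi4py the master would pass these chunks to comm.scatter and each rank would process its own sublist - the file names here are made up):

```python
def partition(files, nranks):
    """Round-robin split of a file list: rank r gets files[r::nranks]."""
    return [files[r::nranks] for r in range(nranks)]

# 16 hypothetical input files spread over 4 ranks:
chunks = partition(["%02d.in" % i for i in range(1, 17)], 4)
print(chunks[0])  # rank 0's share: ['01.in', '05.in', '09.in', '13.in']
```

Round-robin keeps the per-rank load even when files are numbered by arrival order; contiguous slicing works just as well if order matters.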

Some scientists try to combine MPI and parallel in the other order: parallel ... aprun ... or parallel ... mpirun ...

osgx
  • Thank you so much for your elaborate answer! Can you please tell me: if I have multiple NON_MPI Python commands inside s.txt, would this be the proper syntax with aprun? aprun -n 32 sh -c 'parallel -j 8 :::: s.txt' – user2458189 Apr 20 '17 at 14:29
  • I can't (I've never used aprun/Cray). But with a non-MPI Python script (don't even import mpi4py; and check that the compute nodes have the updated script) there will be no clear conflict between `aprun` and `parallel`. You have access to the Cray and may try it; I think it may work. – osgx Apr 20 '17 at 17:09
  • But I don't know how aprun works; such a command may start your `parallel` several times (32?) if used wrongly. – osgx Apr 20 '17 at 23:05