
I am learning how to use supercomputers so that I make good use of their resources. Say I have a Python script that creates a text file containing a random number.

myfile.py

# Imports
import random,os

outdir = 'outputs'
if not os.path.exists(outdir):
    os.makedirs(outdir)

with open (outdir+'/temp.txt','w') as f :
    a = random.randint(0,9)
    f.write(str(a))

This creates only one text file on the local machine.
Is there a way to run multiple instances of this program across multiple nodes and get multiple outputs?

I found an mpiexec template for a C program, which looks like this, but I could not find one for a Python program.

#PBS -N my_job
#PBS -l walltime=0:10:00
#PBS -l nodes=4:ppn=12
#PBS -j oe

cd $PBS_O_WORKDIR

mpicc -O2 mpi-hello.c -o mpi-hello

cp $PBS_O_WORKDIR/* $PFSDIR
cd $PFSDIR

mpiexec ./mpi-hello

cp $PFSDIR/* $PBS_O_WORKDIR

Note: On a single node using multiple cores I can write a bash script like this:

for i in `seq 1 10`;
    do
        python myfile.py && cp outputs/temp.txt outputs/out$i.txt &
    done

But I want to use different nodes.
Required output: outputs/out1.txt, outputs/out2.txt, outputs/out3.txt, etc.

Some related links:
https://www.osc.edu/sites/osc.edu/files/documentation/Batch%20Training%20-%2020150312%20-%20OSC.pdf
https://www.osc.edu/~kmanalo/multithreadedsubmission

BhishanPoudel

1 Answer


Take a look at this link; it may solve your problem:

http://materials.jeremybejarano.com/MPIwithPython/introMPI.html

Your code could then look something like this:

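from mpi4py import MPI
import os
import random

outdir = 'outputs'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # ID of this process, from 0 to (number of processes - 1)

# exist_ok (Python 3.2+) avoids an error when several ranks create the directory at once
os.makedirs(outdir, exist_ok=True)

# one file per rank, so the processes never overwrite each other's output
with open(os.path.join(outdir, 'temp%s.txt' % rank), 'w') as f:
    a = random.randint(0, 9)
    f.write(str(a))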
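If you want the file names from the question (out1.txt, out2.txt, ...) instead of temp0.txt, temp1.txt, ..., one option is to build the name from `rank + 1`. This is only a sketch: the helper names `get_rank` and `write_output` are made up for illustration, and the fallback to rank 0 when mpi4py is missing is my own addition so the script also runs on a laptop without MPI.

```python
import os
import random


def get_rank():
    """Return this process's MPI rank, or 0 when mpi4py is unavailable."""
    try:
        from mpi4py import MPI  # third-party; assumed to be installed on the cluster
    except ImportError:
        return 0  # assumption: behave like a single process without MPI
    return MPI.COMM_WORLD.Get_rank()


def write_output(outdir, rank):
    """Write one random digit to <outdir>/out<rank+1>.txt and return the path."""
    # exist_ok (Python 3.2+) avoids an error when several ranks create the directory at once
    os.makedirs(outdir, exist_ok=True)
    # ranks start at 0, the requested file names start at 1
    path = os.path.join(outdir, "out%d.txt" % (rank + 1))
    with open(path, "w") as f:
        f.write(str(random.randint(0, 9)))
    return path


if __name__ == "__main__":
    print(write_output("outputs", get_rank()))
```

Launched under `mpirun` or `mpiexec`, each process gets a different rank and therefore writes a different file.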

And the PBS file:

#!/bin/bash
################################################################################
#PBS -N myfile.py
#PBS -l nodes=7:ppn=4
#PBS -l walltime=30:30:00:00
#PBS -m bea
##PBS -M mail@mail.mail
###############################################################################

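# run from the directory the job was submitted from (assumed to contain myfile.py)
cd $PBS_O_WORKDIR

# count the lines in the node file and start that many MPI processes
cores=$(awk 'END {print NR}' $PBS_NODEFILE)
mpirun -np $cores python myfile.py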
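To see how many processes (and therefore output files) that `awk` line leads to, here is a rough Python equivalent. It assumes a Torque-style node file, where each host name is repeated once per allocated core; other schedulers may lay the file out differently. The function name `count_slots` is made up for this example.

```python
import os
from collections import Counter


def count_slots(nodefile_path):
    """Return (total_slots, slots_per_host) for a node file with one host per line."""
    with open(nodefile_path) as f:
        hosts = [line.strip() for line in f if line.strip()]
    return len(hosts), Counter(hosts)


if __name__ == "__main__":
    # PBS_NODEFILE is only set inside a running PBS job
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        total, per_host = count_slots(nodefile)
        print("MPI processes to start:", total)
        for host, slots in sorted(per_host.items()):
            print(host, slots)
```

Under that assumption, `nodes=7:ppn=4` gives 28 lines, so `mpirun -np $cores` starts 28 processes and the script writes one file per process.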
efirvida
  • Thank you very much for your answer. I will test this code at the Ohio Supercomputer Center. – BhishanPoudel Oct 14 '16 at 18:02
  • Will it also copy the different outputs to the final output? Maybe it needs pbsdcp, but I don't know how. – BhishanPoudel Oct 14 '16 at 18:06
  • @BhishanPoudel, I don't understand your question. – efirvida Oct 14 '16 at 19:05
  • Sorry, I mean: if I run this Python code on a laptop I get only one output, outputs/temp.txt, but if we use 7 nodes with 12 cores on each node, do we get 7 output files, e.g. outputs/temp0.txt, temp1.txt, etc.? – BhishanPoudel Oct 14 '16 at 19:22
  • Yes. You can test it on your own machine with the command `mpiexec -n 5 python myfile.py`, which runs 5 processes and creates 5 files. – efirvida Oct 14 '16 at 19:26
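After a local test run like the one in the last comment, a small script can confirm that every process wrote its file. This is a sketch that assumes the `outputs/temp<rank>.txt` naming from the answer; `collect_outputs` is a made-up helper name.

```python
import glob
import os
import re


def collect_outputs(outdir="outputs"):
    """Map rank -> digit for every <outdir>/temp<rank>.txt file found."""
    results = {}
    for path in glob.glob(os.path.join(outdir, "temp*.txt")):
        match = re.fullmatch(r"temp(\d+)\.txt", os.path.basename(path))
        if match is None:
            continue  # skip e.g. the plain temp.txt written by the non-MPI script
        with open(path) as f:
            results[int(match.group(1))] = int(f.read().strip())
    return results


if __name__ == "__main__":
    found = collect_outputs()
    print("found %d files" % len(found))
    for rank in sorted(found):
        print(rank, found[rank])
```

With `mpiexec -n 5` you would expect it to report 5 files, for ranks 0 through 4.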