
I am new to cluster computation and I want to repeat an empirical experiment 100 times in Python. For each experiment, I need to generate a dataset and solve an optimization problem, and then I want to average the resulting values. To save time, I hope to do this in parallel: for example, if I can use 20 cores, I only need to repeat the experiment 5 times on each core.

Here's an example of a test.slurm script that I use for running the test.py script on a single core:

#!/bin/bash
#SBATCH --job-name=test        
#SBATCH --nodes=1               
#SBATCH --ntasks=1              
#SBATCH --cpus-per-task=1      
#SBATCH --mem=4G                 
#SBATCH --time=72:00:00          
#SBATCH --mail-type=begin       
#SBATCH --mail-type=end         
#SBATCH --mail-user=address@email

module purge
module load anaconda3/2018.12
source activate py36

python test.py

If I want to run it in multiple cores, how should I modify the slurm file accordingly?

1 Answer

To run the test on multiple cores, you can use srun with the -n option; the number after -n is the number of processes you want to launch:

srun -n 20 python test.py

srun is Slurm's parallel job launcher.

Or you can change ntasks and cpus-per-task in the slurm file. Note that these directives only reserve the cores; the python line in the script still runs once, so test.py itself has to spread the work across the cores (for example via srun or Python multiprocessing). The slurm file will look like this:

#!/bin/bash
#SBATCH --job-name=test        
#SBATCH --nodes=1               
#SBATCH --ntasks=20              
#SBATCH --cpus-per-task=1      
#SBATCH --mem=4G                 
#SBATCH --time=72:00:00          
#SBATCH --mail-type=begin       
#SBATCH --mail-type=end         
#SBATCH --mail-user=address@email

module purge
module load anaconda3/2018.12
source activate py36
python test.py
– j23
  • Thanks for your reply! I tried to change `ntasks` to 20, but it seems that the processing time is not reduced. I guess I should change my `test.py` file too? In my `test.py` file, I use a `while` loop to repeat 100 times, which seems to be one task. What should I do to change it into 20 parallel tasks? – aurora_borealis Mar 05 '22 at 02:06
  • @aurora_borealis We can approach this in different ways. You can use [python multiprocessing](https://docs.python.org/3/library/multiprocessing.html). This cannot be discussed effectively within a comment. Perhaps [this](https://stackoverflow.com/questions/39974874/using-pythons-multiprocessing-on-slurm) SO thread might be interesting to you. – j23 Mar 05 '22 at 06:56
  • If you don’t want to dive into multiprocessing, one easy hack is to make your python code test.py take a loop index from the command line (you have a while loop inside this file, right? The while loop should work based on this index) and run based on that index. Then, when you launch the python code, use a for loop inside the slurm file: in that loop, you launch *srun -n 1 python test.py loop_index* 20 times. The loop_index argument carries the information test.py needs to perform its share of the tasks. – j23 Mar 05 '22 at 07:05
  • 1
    I used python multiprocessing and solved it, thank you! – aurora_borealis Mar 07 '22 at 11:22
  • @aurora_borealis that’s great to hear :) – j23 Mar 07 '22 at 11:42
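The loop-index hack from the comments could be sketched in the slurm file like this (a sketch only; it assumes test.py reads its index from the command line and performs 5 repetitions per index):

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=20
#SBATCH --cpus-per-task=1

module purge
module load anaconda3/2018.12
source activate py36

# Launch 20 single-task job steps in the background, one per loop index,
# then wait for all of them to finish before the job ends.
for loop_index in $(seq 1 20); do
    srun --ntasks=1 --exclusive python test.py "$loop_index" &
done
wait
```

Slurm job arrays (#SBATCH --array=1-20, reading $SLURM_ARRAY_TASK_ID inside test.py) achieve the same effect without the explicit loop.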