I have a script defined like this:
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem 180000
./program1 --threads 16
./program2 --threads 16
I then submit the job with sbatch job.sh.
The problem is that program1 uses all 16 cores, but program2 uses only 1 (both are supposedly multithreaded). However, if I modify the script like this:
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem 180000
./program1 --threads 16
srun --mpi=openmpi ./program2 --threads 16
then program2 also uses all 16 cores. Why is it necessary to add srun?
As extra information, program2 implements its multithreading with std::async.
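One check that might help narrow this down is to compare how many CPUs a process is allowed to use in the plain batch shell and inside an srun step. The following is only a minimal sketch, assuming a Linux compute node with GNU coreutils' nproc; report_cpus is a hypothetical helper name, not part of Slurm, and the srun part is attempted only when SLURM_JOB_ID is set (that is, inside a job):

```sh
#!/bin/sh
# Hypothetical helper (not part of Slurm): print a label and the number
# of CPUs the calling process may run on. nproc honours CPU affinity.
report_cpus() {
    printf '%s: %s CPUs available\n' "$1" "$(nproc)"
}

# CPUs visible to commands launched directly from the batch script
report_cpus "batch shell"

# CPUs visible inside a job step; only attempted inside a Slurm job
if [ -n "${SLURM_JOB_ID:-}" ]; then
    printf 'srun step: %s CPUs available\n' "$(srun nproc)"
fi
```

If the two counts differ, that would suggest the difference comes from how CPUs are bound to the batch shell versus the job step, rather than from program2 itself.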