
I want to go into a container with singularity and then run slurm commands. For example:

singularity shell docker://tensorflow/tensorflow:1.0.0-gpu-py3

then, within it, run the script I want:

python tf_test.py

The contents of tf_test.py are:

import tensorflow as tf
print(tf.random_uniform((3,2)))

The issue I have is that the container doesn't know I am on an HPC system or that Slurm exists. Is it possible to run Slurm commands only after we are inside the container? I am particularly interested in using sbatch. Using srun and then entering the container is cheating and not what I want.

ceztko
Charlie Parker
  • I found [this description](https://info.gwdg.de/wiki/doku.php?id=wiki:hpc:usage_of_slurm_within_a_singularity_container) of directories that need to be bound and configuration inside the container. – paleonix Jun 02 '22 at 08:43

1 Answer


Not sure which Singularity version you're running, but this should work for the 2.4.x series.

You can install Slurm in the container, or, if it's mounted on your cluster at, say:

/apps/sched/slurm/[ver]

You can use the -B / --bind option to bind mount it into the container, along with the Slurm configuration directory:

singularity shell -B /apps/sched/slurm/[ver] -B /etc/slurm docker://tensorflow/tensorflow:1.0.0-gpu-py3

But the job will not be in the container when it runs. To force that, you can submit a run script that executes something like:

singularity exec docker://tensorflow/tensorflow:1.0.0-gpu-py3 python /path/to/tf_test.py
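For example, a minimal sbatch submission script might look like the following. This is a sketch: the job name, walltime, GPU request, and script path are placeholders you'd adjust for your site.

```shell
#!/bin/bash
#SBATCH --job-name=tf_test      # job name shown in squeue (placeholder)
#SBATCH --time=00:10:00         # walltime limit (adjust for your site)
#SBATCH --gres=gpu:1            # request one GPU for the GPU-enabled image

# Run the Python script inside the container; --nv exposes the host's
# NVIDIA drivers and libraries to the container.
singularity exec --nv docker://tensorflow/tensorflow:1.0.0-gpu-py3 \
    python /path/to/tf_test.py
```

You would then submit it from the host with something like `sbatch run_tf.sh`, so Slurm schedules the job and the container is started only inside the allocation.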

Edit: Once you're happy with how this runs, IMO it'd be good to build a Singularity image from the Docker source. In the definition file, set up a %runscript section like:

%runscript
    python "$@"
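For reference, a complete minimal definition file built from the same Docker image might look like this (the file name and exact layout are a sketch, not your required setup):

```
Bootstrap: docker
From: tensorflow/tensorflow:1.0.0-gpu-py3

%runscript
    # Any arguments passed to the image are handed to python
    python "$@"
```

With Singularity 2.4 you can build it with something like `singularity build imagename.img tf.def`.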

You can then just submit:

/path/to/imagename.img /path/to/tf_test.py

Singularity images can be run like an application; by default, executing one runs whatever is in the %runscript section.
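Since the image now behaves like an executable, the submission itself can be a one-liner using sbatch's --wrap option (the resource flags and paths here are illustrative):

```shell
# --wrap turns the command string into a batch script, so no
# separate submission file is needed.
sbatch --gres=gpu:1 --time=00:10:00 \
    --wrap '/path/to/imagename.img /path/to/tf_test.py'
```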

Jason Stover