I am preparing some C++ code to run on a cluster managed by SLURM. The cluster takes one compiled file, a.out, and executes it on 500 different nodes via a job array. Each copy of the program needs to read one input parameter, say a double. My idea was to prepare a .txt file holding one parameter value per line. What is the best strategy for reading these parameter values?

  1. a.out reads the value from the first line and immediately deletes that line. If this is the right strategy, how can I ensure that two copies of a.out are not doing the same thing at the same time?

  2. a.out reads the value from the n-th line. How do I let each copy of a.out know which n it is working with?

  3. Is there a better implementation strategy than the two above? If so, how would I do it? Is C++ fstream the way to go, or should I try something completely different?

Thank you for any ideas. I would appreciate it if you also left some very simple code showing what a.out should look like.

2 Answers

Option two is the best way to go. SLURM sets $SLURM_ARRAY_TASK_ID for each task in the job array, so you can use it to select the line. The call in your job script is simply:

a.out $(head -n $SLURM_ARRAY_TASK_ID parameter.txt | tail -1)

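This passes the line whose number equals the array task ID to a.out as its first command-line argument. Note that this only works if the array indices start at 1 (for example, --array=1-500), because head -n 0 prints nothing.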

Marcus Boden
  • See https://stackoverflow.com/questions/6022384/bash-tool-to-get-nth-line-from-a-file for alternative ways to extract the nth line in the file – damienfrancois Feb 02 '21 at 13:47

You could deploy a param file containing a single line with each executable. That is the simplest solution, because you know in advance how many nodes you are deploying to.

Or you could have one node play the role of a service registry. The registry would then distribute the params to the nodes (for example, over the network). It could hold the list of params, and every client (distinguishable, e.g., by IP address) would get the next line from the file via this service.

StPiere
  • Yes, the first option looks good. Just one question: as another way, I could also store the parameters in a vector inside the a.out code. How do I pass SLURM_ARRAY_TASK_ID to a.out during execution? I suspect there is some bash command I do not know that would do this. – Greenhorn3.14 Jan 19 '21 at 10:49