I'm looking for a way to submit an OpenMP job to a Grid Engine scheduler, while specifying the number of cores it should run on. Something equivalent to LSF's -n option, or PBS's -l nodes=[count] option.

When I search for this, I see a bunch of answers specifying syntax like "-pe threaded [number of cores]". In those answers, there is no mention of having to create a parallel environment called "threaded". But when I try this syntax, it fails, saying that the requested parallel environment "threaded" does not exist. And when I type "qconf -spl", the only result I get is "make". So, should this "threaded" parallel environment exist by default, or is it something that has to be manually created on the cluster?

If it has to be manually created, is there any other syntax for submitting jobs to multiple cores that does not rely on cluster-specific naming? This is for a third-party program submitting to a cluster, so I don't want to rely on the client having created this PE, let alone having given it the same name. I was hoping the -l option might offer something, but I haven't been able to find any permutation of it that achieves this.

valiano
teleute00
  • The managers of a Grid Engine installation define and configure the PEs (parallel environments), so, no, you shouldn't expect a PE called `threaded` to exist by default. As for the rest of the question, I can't help; I'm in the fortunate position of having experts administer the clusters I use, so I don't have to concern myself with too much of the nitty-gritty. – High Performance Mark Jan 15 '14 at 14:28

1 Answer

If "qconf -spl" lists only "make", then the only parallel environment on your cluster is "make", and none has been set up for threaded jobs.

What you can do depends on which of these two situations you are in:

A) you have root/admin access to the cluster

B) you don't

In case B, ask your administrator to create a parallel environment. In case A, you have to create one yourself. To create a new parallel environment, type (requires root/admin privileges):

qconf -ap <pe_name>

The default editor will then open a template PE configuration, which you must edit. If you only need an OpenMP parallel environment, you can use these settings:

pe_name            smp
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary TRUE

and for an MPI parallel environment:

pe_name            mpi
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/sge/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /opt/sge/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary TRUE

As you can see, in the MPI case you point SGE to the startup and shutdown scripts for your MPI installation. In the OpenMP case, you simply point to /bin/true.

The allocation_rule values also differ. $fill_up means that SGE fills the free slots on one host and then moves on to the next, so an MPI job may be spread over several machines. For the smp configuration, $pe_slots makes SGE allocate all requested slots on a single machine, which is what a shared-memory OpenMP job needs.
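Since the comments ask whether the PE setup could be scripted, here is a minimal sketch that writes out the same OpenMP-style definition shown above so that an administrator can load it non-interactively. The function name `render_pe_config` and the file name `smp.conf` are made up for illustration; the field values are copied from the smp example above.

```python
def render_pe_config(pe_name: str, slots: int = 9999) -> str:
    """Return the text of a single-host (OpenMP-style) PE definition.

    The values mirror the 'smp' example from this answer; only the
    name and the slot limit are parameters.
    """
    fields = [
        ("pe_name", pe_name),
        ("slots", str(slots)),
        ("user_lists", "NONE"),
        ("xuser_lists", "NONE"),
        ("start_proc_args", "/bin/true"),
        ("stop_proc_args", "/bin/true"),
        ("allocation_rule", "$pe_slots"),  # all slots on one host
        ("control_slaves", "FALSE"),
        ("job_is_first_task", "FALSE"),
        ("urgency_slots", "min"),
        ("accounting_summary", "TRUE"),
    ]
    # Pad keys to a common width so the file looks like qconf's own output.
    width = max(len(key) for key, _ in fields) + 1
    return "".join(f"{key.ljust(width)}{value}\n" for key, value in fields)


if __name__ == "__main__":
    # Hypothetical file name; an admin would load it afterwards.
    with open("smp.conf", "w") as handle:
        handle.write(render_pe_config("smp"))
```

An administrator could then load the file with `qconf -Ap smp.conf` instead of editing interactively. As far as I know, the PE also has to be added to the pe_list of at least one queue before jobs can use it (for example `qconf -aattr queue pe_list smp all.q`, where the queue name `all.q` is an assumption about the cluster).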

If you use MPI, your nodes should be connected by a high-performance interconnect such as InfiniBand; otherwise your jobs may spend much more time communicating than calculating.

EDIT: by the way, the correct syntax to submit a job with a parallel environment is indeed:

qsub -pe <pe_name> <nb_slots>
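For a program that submits jobs on the user's behalf, the command line above could be assembled along these lines. This is only a sketch: `build_qsub_command` and the script name `job.sh` are hypothetical, and passing `OMP_NUM_THREADS` through qsub's `-v` option is one possible way to tell OpenMP how many threads to start, not something the scheduler requires.

```python
import shlex


def build_qsub_command(pe_name, slots, script, extra_args=()):
    """Build the argv list for 'qsub -pe <pe_name> <nb_slots> ... script'.

    OMP_NUM_THREADS is exported to the job with -v so the OpenMP
    runtime uses as many threads as slots were requested.
    """
    if slots < 1:
        raise ValueError("slots must be a positive integer")
    return [
        "qsub",
        "-pe", pe_name, str(slots),
        "-v", f"OMP_NUM_THREADS={slots}",
        *extra_args,
        script,
    ]


if __name__ == "__main__":
    # 'smp' and 'job.sh' are placeholders for the site's PE and your script.
    print(shlex.join(build_qsub_command("smp", 4, "job.sh")))
```

Alternatively, the job script itself can read the `NSLOTS` variable that Grid Engine sets for parallel jobs, e.g. `export OMP_NUM_THREADS=$NSLOTS`, so the thread count always matches what was actually granted.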

FINAL EDIT: the final answer to the question comes from the comments below. In practice, SGE cannot handle multi-threaded jobs if no parallel environment (PE) is set up on the cluster. If you do not have admin privileges on the cluster, you must either guess the correct PE by listing the available ones with qconf -spl and inspecting each with qconf -sp <pe_name>, or add an option to your software that lets users specify which PE to use.

Otherwise, i.e. if no PE is available on the cluster, you cannot use a parallel version of your software.
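The "guess the PE" approach could look roughly like the sketch below: list the PEs with `qconf -spl`, read each one with `qconf -sp <pe_name>`, and keep those whose allocation_rule is $pe_slots. The helper names are made up for illustration, and the parsing assumes the plain "key value" layout shown in the configurations above.

```python
import subprocess


def parse_pe_config(text):
    """Turn 'key   value' lines from 'qconf -sp <pe_name>' into a dict."""
    config = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            config[parts[0]] = parts[1].strip()
    return config


def is_single_host_pe(config):
    """True if the PE places all slots of a job on one machine."""
    return config.get("allocation_rule") == "$pe_slots"


def run_qconf(argv):
    """Run a qconf command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def find_openmp_pes(run=run_qconf):
    """Return the names of all PEs that look usable for OpenMP jobs.

    'run' is injectable so the logic can be tried without a cluster.
    """
    names = run(["qconf", "-spl"]).split()
    return [
        name
        for name in names
        if is_single_host_pe(parse_pe_config(run(["qconf", "-sp", name])))
    ]


if __name__ == "__main__":
    print(find_openmp_pes())
```

Note that this only checks the allocation rule. It does not verify that the PE is attached to a queue the submitting user may use, so a submission can still fail and the software should fall back to a serial run in that case.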

See the comments for further information.

Danduk82
  • As mentioned in OP, we create a third party program that interacts with schedulers. So we were very much hoping to find a way that doesn't depend on factors outside of our control - i.e. whether or not the client has configured these pe's, and named them a certain way, etc... We don't have any access to these client systems. We're able to do this with an out-of-the-box flag in all the other schedulers - this "having to configure parallel envs" thing seems to be quite specific to SGE. Is that 100% the only way to do this? If so, then we effectively have no solution that will work for us. :-( – teleute00 Jan 16 '14 at 21:19
  • That being said - your OpenMP example there. You have slots as "9999". Will that then work on any system? (i.e. could we include a script that the admin could run that would just set this up for them, without us needing to know anything about the systems, calc the various permutations, etc...?) – teleute00 Jan 16 '14 at 21:23
  • The slots value of 9999 is not a problem, since the effective number of slots depends on the queue that is being used. – Danduk82 Jan 16 '14 at 21:29
  • Ah, OK. I better understand your problem now. No, you will not be able to use a thread number option without a parallel environment using SGE. But maybe you could add an option for your software where the user can specify the pe_name that has to be used? – Danduk82 Jan 16 '14 at 21:31
  • A more hacky way could be to *guess* the parallel environment by executing `qconf -spl` in a `system()` call to list the available parallel environments, then iterate over them, inspect each one's configuration with `qconf -sp <pe_name>`, and look for those that have what you need. Then just use one of them... – Danduk82 Jan 16 '14 at 21:38
  • We could add the ability to specify a pe, but our users are exceptionally unlikely to know that. They're just end users who often have no exposure to clusters at all. We could query for it, as mentioned, but that's more complex - and also still means that we have to rely on the client's admin to have set up a pe. That was why I was wondering whether setting the pe up could be generically scripted - something we could just give to them to run and set it all up automatically, that should work regardless of their hardware. – teleute00 Jan 16 '14 at 21:43
  • If the sysadmin of a particular cluster didn't set up a PE, you should simply not use OpenMP parallelization on that cluster, but a serial or single-CPU version of your code. Otherwise Grid Engine cannot account for the right number of CPUs for your job. Anyway, on every serious cluster you will always find some PE. In fact, the problem will probably be finding the right one. At the very least, look for `allocation_rule == $pe_slots` in the PE description if you want to use OpenMP. – Danduk82 Jan 16 '14 at 21:49
  • Not the answer I wanted, but since the answer I wanted was that the scheduler has a capability it doesn't, this appears to be the correct answer nonetheless. :-) I'll mark as answer, but would be cool if you had a sec to edit the answer just to mention that the answer is further expounded upon/clarified in the comments (for reference of anyone looking here in future, since original answer was before the issue was completely clear). Thanks! – teleute00 Jan 17 '14 at 01:22