Questions tagged [sungridengine]

Oracle Grid Engine, previously known as Sun Grid Engine (SGE), CODINE (Computing in Distributed Networked Environments) or GRD (Global Resource Director), is an open source batch-queuing system originally developed and supported by Sun Microsystems. Sun once also sold a commercial product based on SGE, known as N1 Grid Engine (N1GE).

With the purchase of Sun by Oracle, Grid Engine was forked, and there are currently three actively maintained forks: Univa Grid Engine, Son of Grid Engine and Scalable Grid Engine/Open Grid Scheduler.

Until recently, Oracle offered a version known as Oracle Grid Engine, but support has been transferred to Univa along with the copyrights, and the Oracle version is expected to be folded into Univa Grid Engine.

The Scalable Grid Engine and Son of Grid Engine versions are open source and free to use under the Sun Industry Standards Source License.

The Univa Grid Engine and Oracle Grid Engine forks are proprietary and, apart from time-limited demo versions, are only available with a support contract.

Scalable Logic offers an optional support contract for the Scalable Grid Engine version.

SGE is typically used on a computer farm or high-performance computing (HPC) cluster and is responsible for accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel or interactive user jobs. It also manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses.

SGE is the foundation of the Sun Grid utility computing system, made available over the Internet in the United States in 2006, later becoming available in many other countries.

332 questions
0
votes
1 answer

Starting many unrelated jobs in parallel on Grid Engine?

I often want to start a set of entirely unrelated ("embarrassingly" parallel) jobs on a Grid Engine cluster, for load-balancing purposes. What I do at the moment, I generate one Bash script for each job and then submit each of them separately, all…
user438602
0
votes
0 answers

Open Grid Scheduler/Sun Grid Engine qrsh bad exit code on halt/reboot

I use OGS on spot instances through qrsh calls. To have my program work properly, I need to be able to know when a job has failed due to a system shutdown (me losing the spot instance). If we execute a remote command via ssh and the remote system…
Finch_Powers
  • 2,938
  • 1
  • 24
  • 34
0
votes
1 answer

How do I suppress error and output log files in SGE

I'm running code in a Sun Grid Engine batch system that generates large log files. I can choose the output locations using the -o and -e options, but would like to know if I can tell it not to record the output at all.
invisiblerhino
  • 840
  • 1
  • 10
  • 18
0
votes
1 answer

How to set sun grid engine scheduling policy to satisfy this?

We use Sun Grid Engine (actually Open Grid Scheduler) as DRMS. Suppose we have 3 users: uA, uB, uC. uA submits 100000 jobs, then uB submits 10 jobs, then uC submits 1 job. With the default scheduling policy, Grid Engine will run uA's 100000 jobs and then…
zhangailin
  • 926
  • 2
  • 10
  • 20
0
votes
1 answer

Run a multithreaded job in 1 slot?

What happens if I would try to run a multithreaded job in 1 SGE slot? Would it fail to start multiple threads? Or would it still start these multiple threads and potentially overload the SGE cluster node, because it is going to run more threads than…
WillamS
  • 2,457
  • 6
  • 24
  • 23
0
votes
1 answer

Prevent execution of non-SGE programs

From the point of view of the system administration of an SGE node, is it possible to force users to run long-running programs through qsub instead of running it stand-alone? The problem is that the same machine is acting as the control node and the…
Ray
  • 880
  • 1
  • 10
  • 18
0
votes
2 answers

Error while using -N option with qsub

I tried to use qsub -N "compile-$*" in a Makefile and it gives the following error, because $* equals "compile-obj/linux/flow" in this case: qsub: ERROR! argument to -N option must not contain / The whole command which I am using is: qsub -P…
crazy_prog
  • 1,085
  • 5
  • 19
  • 34
0
votes
1 answer

efficient way to wait for job completion : python and drmaa

I wanted to ask about the "wait" feature in the DRMAA API that I am using through Python. Does it run qstat constantly (if we are running it on SGE) to check whether a program has finished execution? Our admin wants us to avoid constant qstat calls, as it slows…
Abhi
  • 6,075
  • 10
  • 41
  • 55
0
votes
3 answers

Problem in using C dynamic loading routines

I have an application consisting of different modules written in C++. One of the modules is meant for handling distributed tasks on Sun Grid Engine. It uses the DRMAA API for submitting and monitoring grid jobs. If the client doesn't support grid,…
sud03r
  • 19,109
  • 16
  • 77
  • 96
0
votes
1 answer

Avoid printing job exit codes in SGE with option -sync yes

I have a Perl script which submits a bunch of array jobs to SGE. I want all the jobs to be run in parallel to save me time, and the script to wait for them all to finish, then go on to the next processing step, which integrates information from all…
0
votes
1 answer

Error while opening shared object: SunGrid Engine

My application uses the Sun N1 Grid Engine through the DRMAA API, provided as the shared object libdrmaa.so. I am using dlopen and dlsym to access functions of the library. That works fine. Now if I try to link it from the command line, the executable is built…
sud03r
  • 19,109
  • 16
  • 77
  • 96
0
votes
1 answer

Sun Grid Engine suspend instead of restart jobs

This may be a cluster specific issue that can only be addressed by an admin, but when I have a low priority job and a high priority one comes along, the process is killed. When the high priority job finishes, the low priority job is restarted. Is…
zje
  • 3,824
  • 4
  • 25
  • 31
0
votes
2 answers

Using AWK to read line from file and create a variable

I have a text file with a list of filenames. I would like to create a variable from a specific line number using AWK. I get the correct output using: awk "NR==\$Line" /myPath/fileList.txt I want to assign this output to a variable and from…
Sara
  • 1
  • 1
  • 1
  • 3
-1
votes
1 answer

How to detect which HPC scheduler (Torque, Sun Grid Engine etc) am I using?

I need to run a different script depending on the type of scheduler, which necessitates a reliable way to detect if the scheduler is Torque, SGE or something else. Something like $SHELL telling which shell I am using. or something like name. I am…
-1
votes
2 answers

Batch job on lustre is not working: awk: cmd. line:1 "unexpected newline or end of string"

I have recently started running Python batch jobs on a Lustre system. I have recently made changes to the shell script, resulting in the error: awk: cmd. line:1: NR== awk: cmd. line:1: ^ unexpected newline or end of string The script is as…
user3140106
  • 347
  • 4
  • 16