Questions tagged [lsf]

LSF, or Load Sharing Facility, is software that executes batch jobs on networked Unix and Windows systems across many different architectures. It is commonly used on HPC clusters at universities and research centers around the world.

The Load Sharing Facility (LSF) is a set of tools for distributing jobs across a set of networked systems. It was initially developed by Platform Computing, which was acquired by IBM in 2012. It is now called IBM Spectrum LSF.

229 questions
2
votes
1 answer

LSF: about requesting nodes, exclusively selecting nodes and running mpirun

I am very confused about submitting a job in a multi-user cluster environment. I use a script with the following header #BSUB -L /bin/bash #BSUB -n 10 #BSUB -J jobname #BSUB -oo log/output.%J #BSUB -eo log/error.%J #BSUB -q queue_name #BSUB -P…
simona
2
votes
0 answers

Linking between dask.distributed and LSF cluster

I'm using IBM's LSF platform to run my code in parallel. At the moment, this entails "manually" breaking the code into a job array; instead of: for i in range(100): x[i] = f(i) I distribute f over 100 workers, and then "manually" collect all…
Adam Haber
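For context on the manual job-array approach this excerpt describes, here is a minimal Python sketch. It assumes each array element reads its index from the LSB_JOBINDEX environment variable that LSF sets for array jobs, and that the array was submitted with indices starting at 1. The function f, the results directory, and the file naming are hypothetical placeholders, not taken from the question.

```python
import json
import os
from pathlib import Path


def f(i):
    # Hypothetical stand-in for the real per-element computation.
    return i * i


def run_one_element(results_dir, environ=os.environ):
    # Assumption: LSF exports LSB_JOBINDEX to each array element,
    # and the array was submitted with indices 1..n.
    index = int(environ["LSB_JOBINDEX"])
    out = Path(results_dir) / f"result_{index}.json"
    # Shift to 0-based so that element `index` computes f(index - 1).
    out.write_text(json.dumps({"index": index, "value": f(index - 1)}))
    return out


def collect(results_dir, n):
    # Gather the per-element files back into one list, x[i] = f(i).
    x = [None] * n
    for path in Path(results_dir).glob("result_*.json"):
        record = json.loads(path.read_text())
        x[record["index"] - 1] = record["value"]
    return x
```

Each array element would call run_one_element once, and a final step would call collect after all elements finish. Elements that failed show up as None in the collected list.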
2
votes
1 answer

multi-user cluster: IBM Platform LSF: user changing priority of jobs

I am a user of a multi-user cluster that uses IBM Platform LSF (on Linux). I would like to change the priority of my jobs relative to my other jobs (not the absolute priority of the jobs in the queue). An example: I have launched 500 jobs in the…
simona
2
votes
1 answer

Email alert when LSF job array is finished running

I'm submitting a large-ish job array that might take a few hours to run (but might also fail with an error a few minutes in), and I'd like to get an email when it's done. If I don't set the -oo flag on bsub to a file, it will send me an email when…
Empiromancer
2
votes
1 answer

Is there any way to reduce the I/O wait times in Linux jobs?

I am running multiple parallel file-handling processes (minimum 200 processes), where each process reads logs of varying size (0-50 MB) to capture real-time data from the logs. I am running my jobs on a Linux machine with 16 cores and 8 GB of RAM. But…
vikas chib
2
votes
2 answers

How to create a correct array that can be referenced via char** to pass parameters to the LSF API

I want to use a small C wrapper to access the so-called LSF API. LSF is the "load sharing facility", which is something like a platform to dispatch computing jobs on various machines (created by IBM). I figured out how to do basic job submitting…
and0r
2
votes
4 answers

having a job run only after all my previous jobs have finished

I found a post here indicating how I might tell bsub to wait for a specified set of jobs to finish before running; however, this only works if one knows the number of jobs beforehand. I would like to run an arbitrary number of jobs, and run a…
kmace
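One common way to approach the dependency problem in this excerpt, sketched here in Python as a command builder. It assumes the earlier jobs were all submitted with job names sharing a common prefix, and relies on my understanding that a bsub -w dependency expression accepts a job name ending in the wildcard * to match every job whose name begins with that string. The prefix and the follow-up command are hypothetical, and the sketch only builds the argument list without submitting anything.

```python
import shlex


def build_followup_command(prefix, command):
    # Assumption: earlier jobs were submitted with -J names starting with
    # `prefix`. ended(...) is satisfied once a job finishes, whether it
    # ended in the DONE or the EXIT state.
    dependency = f'ended("{prefix}*")'
    # The follow-up job is deliberately left unnamed so that it cannot
    # match its own wildcard.
    return ["bsub", "-w", dependency] + shlex.split(command)
```

The resulting list could be passed to subprocess.run on a host where bsub is available. Using done(...) instead of ended(...) would require every earlier job to finish successfully.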
2
votes
1 answer

Change priorities of my own submitted jobs

I have many jobs running and pending. I would like to indicate the relative priority of jobs that I have submitted to the queue, that are pending, but not yet running. Is it possible to set this priority after submission? Is it possible to set…
gerrit
2
votes
1 answer

IBM Platform LSF Exit code=139

I've encountered an error while executing a SAS batch command. The batch command is executed by IBM Platform LSF. The bhist command shows the following: The job exited with exit code 139. According to the LSF admin guide, jobs terminated with a system signal are returned by…
Igor Khalin
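As a worked illustration of the exit code in this excerpt: by the usual convention, an exit code above 128 means the job was terminated by a signal, and the signal number is the exit code minus 128. For 139 that gives 139 - 128 = 11, which is SIGSEGV on Linux. The helper below is a minimal sketch of that arithmetic; its name is hypothetical, and the signal names it returns depend on the platform where it runs.

```python
import signal


def describe_exit_code(code):
    # Codes above 128 are interpreted as 128 + signal number.
    if code > 128:
        signum = code - 128
        try:
            return signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            return f"signal {signum}"
    return f"exit status {code}"
```

A segmentation fault usually points at the application itself (here, the SAS process) rather than at the scheduler.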
2
votes
1 answer

PID of command submitted to LSF with bsub

When a command is submitted with bsub, it starts a process with the res command. res in turn starts the actual command as another process. I want to know the PID of this actual command. Let's say I have submitted this command. With bhist -l jobid, we can…
2
votes
1 answer

Passing job array index as an argument in drmaa-python

I am using an lsf-drmaa implementation and interfacing through drmaa-python. I usually pass the environment variable $LSB_JOBINDEX into my run.sh script as an argument. Through drmaa-python, I created JobTemplate jt and would like to pass it…
user1575175
2
votes
1 answer

Changing python version on platform LSF job script on Linux server

I want to execute my Python code on LSF, and the problem is that the output of import sys print (sys.version) in LSF is 2.6.6 (r266:84292, Jul 22 2015, 16:47:47) [GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] But my code has been written for Python 2.7.…
ehsan badakhshan
2
votes
0 answers

Getting the running jobs on an LSF cluster using python and PlatformLSF

I'm trying to write a simple task manager in python that will be used to run a large number of jobs in an LSF cluster. I'm stuck trying to determine (within a python script) the number of running jobs for a given user. On the command line this…
user41140
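One possible way to count running jobs from a Python script, sketched here by parsing the text output of the bjobs command rather than using the PlatformLSF Python bindings the question mentions. It assumes the default bjobs column layout, in which the first three columns are JOBID, USER, and STAT, and that running jobs show the state RUN. The function name is hypothetical.

```python
def count_running(bjobs_output, user):
    # Assumption: default bjobs layout, i.e. a header line followed by
    # rows whose first three columns are JOBID, USER and STAT.
    count = 0
    for line in bjobs_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "JOBID":
            continue  # skip blank lines and the header
        if fields[1] == user and fields[2] == "RUN":
            count += 1
    return count
```

On a cluster, the input text could come from something like subprocess.run(["bjobs", "-u", user], capture_output=True, text=True).stdout; that call is not exercised here.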
2
votes
2 answers

bsub option confused with job arguments

I want to submit a job to LSF using the bsub command. One of the job arguments is "-P argument_1". So the overall command looks like bsub -P project_name -n 4 -W 10:00 my_job -P argument_1 But bsub considers -P argument_1 as the project_name instead…
2
votes
2 answers

BSUB many MATLAB jobs to a cluster?

I am using the following bash file to submit a MATLAB job to a cluster, #!/bin/bash #BSUB -L /bin/bash #BSUB -J matlab.01 #BSUB -q long #BSUB -n 32 #BSUB -R "span[hosts=1]" #BSUB -W 20:00 #BSUB -R "rusage[mem=3072]" #BSUB -o %J.out #BSUB -e…
XAM