Questions tagged [lsf]

LSF (Load Sharing Facility) is software that executes batch jobs on networked Unix and Windows systems across many different architectures. It is commonly used on HPC clusters at universities and research centers around the world.

The Load Sharing Facility, or LSF, is a set of tools for distributing jobs across a set of networked systems. It was initially developed by Platform Computing, which was acquired by IBM in 2012. It is now called IBM Spectrum LSF.

229 questions
1
vote
3 answers

How can I use the Platform LSF blaunch command to start processes simultaneously?

I'm having a hard time figuring out why I can't launch commands in parallel using the LSF blaunch command: for num in `seq 3`; do blaunch -u JobHost ./cmd_${num}.sh & done Error message: Oct 29 13:08:55 2011 18887 3 7.04 lsb_launch(): Failed while…
Zaid
  • 36,680
  • 16
  • 86
  • 155
1
vote
1 answer

How to convert a loop to a Job-array in LSF cluster

I have 100 files, and I want to parallelise my submission to save time instead of running jobs one by one. How can I change this script to a job array in LSF using the bsub submission system and run 10 jobs at a time? #BSUB -J ExampleJob1 …
LDT
  • 2,856
  • 2
  • 15
  • 32
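
For the job-array question above, a minimal sketch of how the submission could be built from Python. It assumes the standard LSF job-array name syntax `name[1-N]%M`, where `%M` caps the number of array elements running at once. The helper name `job_array_args` and the script `./process.sh` are hypothetical and only serve the illustration.

```python
def job_array_args(name, n_tasks, max_running, command):
    """Build a bsub argument list for an LSF job array.

    name[1-n_tasks]%max_running asks LSF for n_tasks array elements,
    with at most max_running of them running at the same time.
    """
    if n_tasks < 1 or max_running < 1:
        raise ValueError("n_tasks and max_running must be positive")
    array_spec = f"{name}[1-{n_tasks}]%{max_running}"
    return ["bsub", "-J", array_spec, *command]


# Hypothetical usage: 100 elements, 10 at a time.
args = job_array_args("ExampleJob1", 100, 10, ["./process.sh"])
```

Inside each array element LSF sets the `LSB_JOBINDEX` environment variable, which the submitted script can use to pick its input file. The list can be passed to `subprocess.run(args)` on a host where `bsub` is available.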
1
vote
0 answers

New to LSF systems: how can I avoid load and show results quickly in a good output format? Where do I use the bqueues -r command?

import commands import tempfile cluster = commands.getoutput('lsid | grep "My cluster name" |awk \'{print $5}\'') path = '/nxdi_env/lsf/conf/lsbatch/' + cluster + '/configdir/lsb.users' bl = commands.getoutput('sed -n \'/# BL Fairshare…
Ar8itrator
  • 23
  • 6
1
vote
1 answer

What does jug status 'Active' mean, and why does it not equal the number of procs requested?

I've been unable to find what status 'Active' tasks are. I'm using JUG 2.1.1, and I don't see that word appear anywhere in the manual, except in a footnote about 'active-wait'. I'm using an LSF array to run a large number (hundreds of thousands) of…
LGS
  • 110
  • 8
1
vote
0 answers

What are the pros/cons of 'bsub < script.sh'

Consider first script_0.sh: # script_0.sh foo bar baz I can run this script via LSF like this, for example: bsub -q myqueue -J myjob_0 -o path/to/log_0.out -e path/to/log_0.err -- /bin/sh ./script_0.sh Now consider a second, very similar…
kjo
  • 33,683
  • 52
  • 148
  • 265
1
vote
1 answer

UNIX and LINUX bsub command -W limit

I am trying to submit 10 jobs using the bsub command on a specific location. $ bsub -q alloc -P acc_CLASSNAME\ > -J "Array_#4[1-10]"\ > -o "Output.%I" -n 1\ > -W 2:00 $HOME/bash/count.sh 1 When I run this, I keep getting an error: Run limit must…
Rivendel
  • 31
  • 4
1
vote
1 answer

Why is this Python Lark grammar so slow?

I'm trying to parse the output of "ypcat -k netgroup" The output looks like many lines of this format: group1 (host1,user1,domain1) (host2,user2,domain2) (host3,user3,domain3) ... or sometimes group2 groupa groupb groupc ... I first tried using…
Chuck Tung
  • 321
  • 2
  • 10
1
vote
2 answers

Print "Jobs Finished" only when all the bjobs are completed

I have a python script that submits multiple jobs using bjobs. Below is the code snippet for jobs in job_list: i=0 os.system("bsub -J JOB_{} jobs".format(str(i)) i+=1 I want to print "Finished runnning" only when all the jobs have…
Astro
  • 11
  • 5
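
One possible approach to the question above, sketched under the assumption that `bsub -K` is acceptable: with `-K`, `bsub` does not return until the submitted job has completed, so waiting for every `bsub` process amounts to waiting for every job. The helper names `bsub_blocking` and `wait_for_all` are hypothetical.

```python
import subprocess


def bsub_blocking(job_name, job_cmd):
    """Build a bsub command that blocks until the job completes (-K)."""
    return ["bsub", "-K", "-J", job_name, *job_cmd]


def wait_for_all(commands):
    """Start every command at once, then wait for all of them to exit.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

On a cluster this might be used as `wait_for_all([bsub_blocking(f"JOB_{i}", ["jobs"]) for i in range(n)])` followed by the "finished" message. The trade-off is one local `bsub` process per job for the duration of the run.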
1
vote
3 answers

How to fix error message in tcl script having command [exec bjobs] when no jobs are running?

When I run a Tcl script that contains the following lines: set V [exec bjobs ] puts "bjobs= ${V}" When jobs are present it works properly, but when no jobs are running it shows an error like this: No unfinished job found while…
1
vote
1 answer

Matlab is spawning way too many threads

So, I am running on a Linux cluster with lots of compute nodes to choose from. I get exclusive use of the node. Batch submissions. I am running into issues limiting the number of threads. I should mention I have a parfor loop. When I start matlab…
whoami
  • 85
  • 1
  • 7
1
vote
0 answers

request for clarification in snakemake's documentation regarding 'resources' and 'threads'

I have a question regarding resources and threads (it's not clear to me from the documentation). Are the resources per thread? That's the case with various HPC job submission systems. For example, that's how jobs work with LSF's bsub: If…
DrYak
  • 1,086
  • 1
  • 10
  • 15
1
vote
0 answers

How to extract with Python the list of ids of jobs running on an LSF cluster?

I am currently writing a Python script to launch many simulations in parallel using this command repeatedly: os.system("bsub -q reg -app ... file.cir") And I need to retrieve the job ID list in order to know exactly when all the jobs are completed,…
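
A minimal sketch related to the question above. `os.system` returns only the exit status, so the job ID that `bsub` prints is lost. The sketch assumes `bsub` reports a successful submission on standard output in the usual form `Job <12345> is submitted to queue <reg>.` and captures that line with `subprocess.run`. The helper names `parse_job_id` and `submit` are hypothetical.

```python
import re
import subprocess

# Matches the job ID in bsub's confirmation line, e.g.
# "Job <12345> is submitted to queue <reg>."
JOB_ID_RE = re.compile(r"Job <(\d+)> is submitted")


def parse_job_id(bsub_stdout):
    """Extract the numeric job ID from bsub's standard output."""
    match = JOB_ID_RE.search(bsub_stdout)
    if match is None:
        raise ValueError(f"no job ID found in: {bsub_stdout!r}")
    return int(match.group(1))


def submit(bsub_args):
    """Run bsub with the given arguments and return the new job's ID."""
    result = subprocess.run(
        ["bsub", *bsub_args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if bsub itself fails
    )
    return parse_job_id(result.stdout)
```

Collecting the return values of `submit(...)` in a list gives the job IDs to monitor later.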
1
vote
0 answers

Can LSF be used to run each job on a separate VM?

Based on the documentation, LSF is used mainly for HPC (high-performance computing). Its resource connector (LSF resource connector) is meant to borrow VMs from a resource provider (e.g. AWS). This means LSF will launch an instance and run jobs that you…
1
vote
2 answers

How to use lsf.yaml with snakemake?

I would like to use snakemake with LSF. I followed this URL. My Snakefile contains: rule all: input: "foo.txt", "file.out" rule foo: input: "foo.txt" output: "bar.txt" shell: "set +o pipefail; grep bar {input} > {output}…
user1980099
  • 573
  • 1
  • 8
  • 30
1
vote
0 answers

Retrieving the output of SSH when launched via LSF

I'm using localhost.run to open tunnels, so I want to launch ssh -R 80:localhost:8080 ssh.localhost.run The only problem is that I launch it via LSF as part of a bash script. I get the error Pseudo-terminal will not be allocated because stdin is…
Labo
  • 2,482
  • 2
  • 18
  • 38