Questions tagged [lsf]

LSF (Load Sharing Facility) is software that executes batch jobs on networked Unix and Windows systems across many different architectures. It is commonly used on HPC clusters at universities and research centers around the world.

The Load Sharing Facility, or LSF, is a set of tools for distributing jobs across a set of networked systems. It was initially developed by Platform Computing, which was acquired by IBM in 2012. It is now called IBM Spectrum LSF.

229 questions
0
votes
0 answers

How to specify the number of MPI ranks by means of environment variables?

Let's assume I run my Open MPI application with the following command: `mpirun a.out`, and I specify the number of MPI ranks by means of an LSF job scheduler script: `#BSUB -n 20`. How to specify the number of MPI ranks for mpirun through some Open MPI…
Alexander Pozdneev
0
votes
0 answers

LSF serial jobs on HPC performance worse than local sequential executions

I'm learning how to use HPC on our lab's clusters, which use LSF. I tried some simple serial jobs, each of which counts the frequency of the words in a text file. I wrote a Python script for counting the word frequency named count_word_freq.py, a jobs…
Alex Wang
0
votes
0 answers

mpirun not passing standard input to Fortran

I am trying to run a bit of code on a new cluster; however, I am having some issues passing data through mpirun into the Fortran code. Note that this bit of code has worked on previous clusters, but something seems to be different with this cluster.…
David Duncan
0
votes
1 answer

Define maximum jobs in queue inside lsb.queues

I am trying to find out how many jobs a queue that I defined can hold. Which parameter holds the maximum number of jobs that can be submitted to a specific queue?
0
votes
0 answers

What really happens if LSF starts a single Python job on multiple nodes?

Using LSF, I have submitted a Python job using -n N where N>1. This means it will use multiple cores, which may or may not be on the same node. I have not written any explicit code for inter-process communication, but I do use libraries that can…
gerrit
0
votes
1 answer

Why do I have many more jobs `started` than running or suspended?

According to the bqueues manual page: STARTED Number of job slots used by running or suspended jobs owned by users or user groups in the queue. According to bqueues, I have 369 jobs started: $ bqueues -r lotus | egrep…
gerrit
0
votes
1 answer

LSF - BSUB Running a script if the job is killed

I'm working with LSF, running bsub commands. I'm using the -Ep switch to run a post-exec script. This works great until the job is killed or hits a memory limit, run limit, etc. Is there any way for the job to detect it's running out of…
0
votes
1 answer

Overwriting and errors running multiple instances of Python code

I'm a physics student trying to run a research-related simulation that has stochastic elements. The simulation can be split into several non-interacting parts, each part evolving randomly, so no interaction between runs is required. I use a…
Asaf M
0
votes
0 answers

IBM HPC 4.2 different behavior between IBM MPI and OpenMPI under LSF

We have an IBM HPC 4.2 with 32 compute nodes. We compiled and installed Open MPI 1.10.1 with LSF support. The problem: we see different behavior between IBM MPI (the MPI shipped with the platform, or PMPI) and Open MPI when we use them under…
Wodel
0
votes
1 answer

BatchJobs results gives the function result * -1 + job#?

I am running a minimal example using BatchJobs, and the results are not as expected. I'm using the global_config settings, with debug=TRUE. I am running the following code in R on my HPC server: library(BatchJobs) reg <- makeRegistry(id =…
0
votes
1 answer

Pass a hash object from a Perl script through bsub (LSF)

This is an extension to my older question: Pass a hash object from one Perl script to another using system. @Sobrique answered my question there. Now, I want to know if I can pass a hash object to another Perl script in the same way as my question…
Komal Rathi
0
votes
1 answer

Maximum value for file size limit in bsub (over LSF)

I was trying to set the maximum file size using the bsub -F option, but no manual states the maximum value. Can somebody please help with setting the maximum value for the file size limit?
learner
0
votes
1 answer

Running a Database on LSF platform

I have to run some benchmarks on a computer cluster which uses LSF as a platform to submit jobs. I need to run these benchmarks on different databases. Some of them need to run a server before listening to connections from the client (like…
Bafla13
0
votes
1 answer

How to retrieve job information from the LSF archive

We execute our job application through the bsub command on Linux. When the job completes, what is the command to retrieve the job information from the LSF archive? I know there is a command like bacct jobNo, but it does not retrieve the…
Schwab
0
votes
1 answer

MPI+OpenMP job submission script on LSF

I am very new to LSF. I have 4 nodes with 2 sockets per node. Each node has 8 cores. I have developed a hybrid MPI+OpenMP code. I am submitting the job as follows, which asks each core to perform one MPI task. So I lose the power of…
Bhaiti