Questions tagged [supercomputers]

Supercomputers belong to a class of highly specialised hardware infrastructures, in which a large number of machines are typically pre-organised and linked together with specialised high-speed, low-latency interconnects, so that new forms of concurrent, cooperative processing can be orchestrated. Having such a supercomputing infrastructure is not enough; it is also important to use system tools capable of harnessing most of the available CPU power.

Supercomputers first began to appear in the 1960s.

These early supercomputers had only a single, high-speed processor. Control Data Corporation's CDC-6600, designed by Seymour Cray, was about ten times faster than all other computers of its day, and was dubbed a supercomputer -- the first appearance of the term.

Later, as processing speed, cooling ability, and physical size hit limits, Cray pioneered the method of linking multiple processors together in order to get more speed out of the same machine. This is the same method used in today's supercomputers, which can range in size from thousands of processing cores to hundreds of thousands of processing cores.

*  Seymour CRAY ( yes, the supercomputer guy )
*  said:
*  --------------------------------------------------------------------
*  A supercomputer turns compute-bound problems into I/O bound problems
*  --------------------------------------------------------------------
*  and:
*  --------------------------------------------------------------------
*  It is not hard to build a fast processor or a fast memory,
*  but the challenge is to build a fast system.
*  --------------------------------------------------------------------

Interconnect latency is an additional [TIME]-domain penalty that each process has to pay for using a supercomputer's remote resources under a distributed computation-graph schedule.

Minimising the interconnect's latency costs is thus one natural direction; using a smarter, overhead-aware computation-graph design is the other. Both aim to push a supercomputing system's infrastructure to the bleeding edge of its performance, where the I/O bounds become the limit.
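The latency penalty can be illustrated with a simple cost model. The sketch below assumes the common "fixed latency plus size over bandwidth" approximation for one message; the function names and the 1 microsecond / 10 GB/s figures are illustrative assumptions, not measurements of any real interconnect.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time for one message: fixed latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def total_comm_time(total_bytes, n_messages, latency_s, bandwidth_bytes_per_s):
    """Estimated time to move total_bytes split evenly into n_messages."""
    per_message = total_bytes / n_messages
    return n_messages * transfer_time(per_message, latency_s, bandwidth_bytes_per_s)


# Hypothetical interconnect parameters, chosen only for illustration.
LATENCY_S = 1e-6    # assumed 1 microsecond per message
BANDWIDTH = 10e9    # assumed 10 GB/s

# The same 1 MB payload, sent as 10,000 small messages or as one large one.
many_small = total_comm_time(1_000_000, 10_000, LATENCY_S, BANDWIDTH)
one_large = total_comm_time(1_000_000, 1, LATENCY_S, BANDWIDTH)

print(f"10,000 small messages: {many_small:.6f} s")
print(f"1 large message:       {one_large:.6f} s")
```

Under these assumed numbers the fine-grained schedule is dominated by per-message latency, which is why an overhead-aware computation graph tends to aggregate communication.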


91 questions
0
votes
3 answers

Is there a simple way to find out the power of cluster/node/supercomputer?

I know there are some unix utils for simple architecture queries: arch, nproc, lsb_release -a. Are there any simple ways to find out about the cluster/supercomputer/nodes, like finding out the number of teraflops of the machine and so on?
Adobe
  • 12,967
  • 10
  • 85
  • 126
0
votes
1 answer

C vs Fortran for BLAS 2

I have an application in which I need to carry out a lot of Norms, Dot Products and most importantly, Matrix Vector multiplications. matrix and vectors are huge. Matrix dimension is tending to be a 100000x100000 the loop structure…
user1132648
0
votes
0 answers

Slurm Exit Code 9: too much time between signals elapsed (32 seconds) - job killed

I am bootstrapping a panel of 2.7M using reghdfe and ppmlhdfe. I am using the Picotte cluster at Drexel, as this is computationally infeasible otherwise. When I run this, my job is killed because 1 of the iterations to compute an estimator takes…
0
votes
1 answer

How to include aws credentials and configs while submitting PBS script jobs?

How do you load the aws credentials and config file in PBS scripts(.pbs) file? I want to submit a job that includes file transfer from a remote server to s3 buckets, but getting endpoint url error. Also, the s3 bucket is not public. Any insights…
0
votes
0 answers

How to apply the nrpe installation on compute nodes using master node chroot path with xcat?

We have a cluster provisioned with XCAT. I want to install the nrpe addon provided by Nagios on the compute nodes in the cluster. I don't have much knowledge of xcat as I am a newbie, but as far as I know, to facilitate this we can…
0
votes
0 answers

How to upgrade a python library in a supercomputer

I am using the Tinaroo (University of Queensland) supercomputer. When I run my code using qsub 70my_01_140239.sh I get this error: autosklearn.util.dependencies.IncorrectPackageVersionError: found 'dask' version 2021.11.2 but requires dask…
asmgx
  • 7,328
  • 15
  • 82
  • 143
0
votes
0 answers

why is Rpeak different from Rmax when measuring performance?

Rmax is the maximum performance; Rpeak is the theoretical maximum performance. But why can't supercomputers reach Rpeak? What causes the inefficiency? I am looking for an explanation of the cause of the inefficiency.
mTarifi4
  • 1
  • 1
0
votes
0 answers

configure: error: C compiler cannot create executables while installing mpich with brew on Conda Environment

I was trying to install mpich onto a Conda Environment which I made on a SuperComputer cluster. I do not have Sudo permissions and I don't have access to a desktop-like computer OS. What should I do to get rid of this error? I don't know what Linux…
0
votes
0 answers

Permissions error when installing spglib library on supercomputer

Relatively newer Linux user here, currently trying to install a library called spglib. Following some directions I found online, I have installed cmake, I unzip the install folder, then I perform: cd spglib mkdir _build cd _build cmake .. make make…
FrankC
  • 1
  • 1
0
votes
1 answer

Script is not working with high performance computer

I am using Tinaroo (a high-performance computer at the University of Queensland). I built a simple demo Python script (demo1.py) that creates a file containing the execution time of the code. import datetime StartTime =…
asmgx
  • 7,328
  • 15
  • 82
  • 143
0
votes
0 answers

What is the distribution of computational power in the world currently?

I am trying to get a sense of how centralized computational power is in the world, e.g. are there a few big companies and organizations that own the majority of computational power in the world? For the sake of this question, let's define computational…
Kaveh
  • 466
  • 1
  • 7
  • 21
0
votes
0 answers

Adding HPC Cluster nodes to a Kubernetes env running on local VM/host

I would like to run Kubernetes on a local VM/host and add compute nodes allocated via slurm job allocation on the remote supercomputer. The compute nodes on the remote supercomputer are accessible on the local host by first logging in to the…
Amit Ruhela
  • 311
  • 2
  • 8
0
votes
0 answers

python script on Google Cloud Platform still slow

thanks in advance for your patience - I'm not a programmer, but a researcher. I have a model worked up in a short python script that is computationally intensive. A time horizon input of 3 takes about 2 minutes on my 2020 MacBook Pro, but I need it…
0
votes
0 answers

data exchange between multiple ranks with MPI_Bsend

I would like to ask a few questions (mostly question 2.) for the code below whose purpose is to send data to an arbitrary number of 'target' ranks then receive other data (of different length) from all of the targets, i.e. exchange data with all…
Halbux
  • 21
  • 5
0
votes
0 answers

How to properly calculate CPU and GPU FLOPS performance?

Problem I'm trying to calculate CPU / GPU FLOPS performance but I'm not sure if I'm doing it correctly. Let's say we have: A Kaby Lake CPU (clock: 2.8 GHz, cores: 4, threads: 8) A Pascal GPU (clock: 1.3 GHz, cores: 768). This Wiki page says that…
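As a rough illustration of the kind of calculation this question is about, theoretical peak FLOPS is commonly estimated as cores × clock × floating-point operations per core per cycle. The sketch below uses the core counts and clocks quoted in the question; the `flops_per_cycle` values are assumptions (16 for a CPU core doing double-precision AVX2 FMA on two FMA units, 2 for a GPU core doing one fused multiply-add per cycle) and should be checked against the actual hardware documentation.

```python
def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak: cores * clock (Hz) * FLOPs per core per cycle."""
    return cores * clock_hz * flops_per_cycle


# Core counts and clocks are taken from the question text.
# flops_per_cycle is an ASSUMPTION, not a vendor-confirmed figure.
cpu_peak = peak_flops(cores=4, clock_hz=2.8e9, flops_per_cycle=16)
gpu_peak = peak_flops(cores=768, clock_hz=1.3e9, flops_per_cycle=2)

print(f"CPU theoretical peak: {cpu_peak / 1e9:.1f} GFLOPS")
print(f"GPU theoretical peak: {gpu_peak / 1e9:.1f} GFLOPS")
```

These are upper bounds only; measured performance is lower because of memory bandwidth, clock throttling, and imperfect vectorisation.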