Questions tagged [hpc]

High Performance Computing (HPC) refers to the use of supercomputers and computer clusters to solve a wide range of computationally intensive problems.

Systems with benchmark performance in the hundreds of teraflops are usually considered to be supercomputers. A typical feature of these supercomputers is that they have a large number of computing nodes, typically in the range of O(10^3) to O(10^6). This distinguishes them from small-to-midsize computing clusters, which usually have O(10) to O(10^2) nodes.

When writing software that aims to make effective use of these resources, a number of challenges arise that are usually not present when working on single-core systems or even small clusters:


Higher degree of parallelization required

According to the classical formulation of Amdahl's Law (presented in 1967 and in effect a law of diminishing returns), the maximum speedup one can achieve using parallel computers is restricted by the fraction of serial processes in your code (i.e. parts that cannot be parallelized). That means the more processors you have, the better your parallelization concept has to be. The contemporary, overhead-strict reformulation of Amdahl's Law also accounts for add-on costs: process-spawning overheads, serialization/deserialization (SER/DES) of parameters and results, communication, and the effects of respecting available resources and of atomicity of work (a unit of work that cannot be split further). Comparisons adjusted for these costs reflect the actual net speedup of truly parallel code execution more closely, because they include the costs of preparing and executing the parallel sections.
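
As a rough illustration of the two formulations, the following Python sketch compares the classical formula, S(N) = 1 / (s + (1 - s) / N), with a simple overhead-adjusted variant. The function names and the way overhead is modelled (a single `overhead_fraction` term added to the denominator, expressed as a fraction of the single-process runtime) are assumptions made for this example, not a definitive statement of the revised law.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Classical Amdahl's Law: S(N) = 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def overhead_aware_speedup(serial_fraction, n_procs, overhead_fraction):
    """Assumed overhead-adjusted variant (illustrative only).

    overhead_fraction lumps together add-on costs (process spawning,
    SER/DES, communication) as a fraction of the single-process runtime.
    """
    parallel_part = (1.0 - serial_fraction) / n_procs
    return 1.0 / (serial_fraction + overhead_fraction + parallel_part)


# Compare both models for a code that is 5 % serial, with an assumed
# 1 % add-on overhead, across increasing processor counts.
for n in (10, 100, 1_000, 100_000):
    classical = amdahl_speedup(0.05, n)
    adjusted = overhead_aware_speedup(0.05, n, 0.01)
    print(f"N={n:>7}: classical {classical:6.2f}x, overhead-adjusted {adjusted:6.2f}x")
```

With a 5 % serial fraction the classical speedup can never exceed 20x, however many processors are added, and any add-on overhead lowers that ceiling further.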


Specialized hardware and software

Most supercomputers are custom-built and use specialized hardware and/or software components, so you have to learn a lot about new types of architectures if you want to get maximum performance. Typical examples are the network hardware, the file system, or the available compilers (including compiler optimization options).


Parallel file I/O becomes a serious bottleneck

Good parallel file systems handle multiple requests in parallel rather well. However, there is a limit to it, and most file systems do not support the simultaneous access of thousands of processes. Thus reading/writing to a single file internally becomes serialized again, even if you are using parallel I/O concepts such as MPI I/O.


Debugging massively parallel applications is a pain

If you have a problem in your code that only appears when you run it with a certain number of processes, debugging can become very cumbersome, especially if you are not sure where exactly the problem arises. Examples of problems that depend on the number of processes are domain decomposition or the establishment of communication patterns.


Load balancing and communication patterns matter (even more)

This is similar to the first point. Assume that one of your computing nodes takes a little bit longer (e.g. one millisecond) to reach a certain point where all processes have to be synchronized. If you have 101 nodes, you only waste 100 * 1 millisecond = 0.1 s of computational time. However, if you have 100,001 nodes, you already waste 100 s. If this happens repeatedly (e.g. every iteration of a big loop) and if you have a lot of iterations, using more processors soon becomes non-economical.
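
The arithmetic above can be reproduced with a small Python sketch. The function name `wasted_core_seconds` and its parameters are hypothetical, and the model is deliberately simple: it assumes exactly one node is late and every other node sits idle for the full delay at each synchronization point.

```python
def wasted_core_seconds(n_nodes, delay_s, n_sync_points=1):
    """Idle time summed over all waiting nodes, in seconds.

    Assumes one straggler node arrives delay_s late at each of
    n_sync_points synchronization points while the other
    (n_nodes - 1) nodes wait for it.
    """
    return (n_nodes - 1) * delay_s * n_sync_points


# The two cases from the text: a 1 ms delay at a single sync point.
print(wasted_core_seconds(101, 1e-3))       # about 0.1 s
print(wasted_core_seconds(100_001, 1e-3))   # about 100 s

# The same delay repeated in every iteration of a 1000-iteration loop.
print(wasted_core_seconds(100_001, 1e-3, n_sync_points=1_000))
```

The waste grows linearly in both the node count and the number of synchronization points, which is why a small imbalance becomes expensive at scale.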


Last but not least, the power

Thermal ceilings and power-capping strategies add another dimension to performance tuning, and what counts is end-to-end performance. Thermal constraints and power caps introduce a further set of parameters that determine how efficiently HPC workloads can be computed within the time and capped electric power available to the physical HPC infrastructure. Because the scenarios differ in many ways, the best choice is not easy to see and is often counter-intuitive: the optimum thermal and power-capping configuration for distributing a workload over the computing infrastructure is frequently the opposite of what one would expect. Repeated workloads (as in weather modelling) typically adapt these settings as experience is gathered, because sufficiently extensive prior testing was not possible.

1502 questions
-2
votes
1 answer

How to change a dynamic script into static script in bash?

I am new to bash. I am working on a project on an HPC server, and having some trouble in changing a dynamic file into static. I have created an alias "submitjob". Its tasks are: take a user input "job_ID"; create a copy of generic slurm script -…
-2
votes
1 answer

process to process computation and communication in mpi

P1 P2 P3 P4 1 2 3 4 5 6 7 8 1 2 3 4 0 6 0 8 Suppose P1, P2, P3, P4 are processes and P1 has data points 1 2 5 6, P2 has data points 3 4 7 8, P3 has data points 1 2 0 6, P4 has data points 3 4 0 8. I want to perform stencil computation on this piece of…
user15240025
-2
votes
1 answer

Running parallel c++ code in Google cloud platform

I have a C++ code that needs to be run independently 100,000 times, each with a different set of arguments. One single run takes around 20 minutes on a small laptop. I would like to parallelize the execution using a cloud infrastructure like GCP.…
anupgp
-2
votes
1 answer

how to ensure multiprocessing code using the configured cpu cores?

I use multiprocessing Pool to run parallel. I tried with 4 cores first in HPC with sub. When it uses 4 core, the time is reduced 4 times compared to 1 core. When I check with qstat, several times it uses 4 cores but after that just 1 core, with…
pughon
-2
votes
2 answers

How To Learn HPC?

So I wanted to learn HPC and I couldn't find any resources list. Of course we have "Awesome HPC" but last update was for 3 years ago. My main question is how to learn HPC. what are the prerequisites? what programming languages should I know? And if…
-2
votes
1 answer

find and move files with path name with python3

I am trying to recreate a python script from my perl script to find all files with the common name model1_r.pdb and then move them to a new folder, renamed after their previous containing folder. This is the python code I wrote; import os, shutil, glob # This…
Kay
-2
votes
1 answer

HPC cluster extremely slow

I have set up a small cluster with 1 head node and 3 compute nodes. My client machine is a Windows 2016 Server which I use to submit Workbook offloading jobs. My problem - the HPC is extremely slow; if I run the job on my local machine, it runs…
KMLN
-2
votes
1 answer

How to run binary executables in multi-thread HPC cluster?

I have this tool called cgatools from complete genomics (http://cgatools.sourceforge.net/docs/1.8.0/). I need to run some genome analyses in High-Performance Computing Cluster. I tried to run the job allocating more than 50 cores and 250gb memory,…
MAPK
-2
votes
1 answer

Convolution with CUDA C, error: expression must be a modifiable lvalue

__global__ void conv(const float *a, const float *a1, const size_t n) { // compute the global element index this thread should process unsigned int i = threadIdx.x + blockDim.x * blockIdx.x; unsigned int j = threadIdx.y +…
braigns10
-2
votes
1 answer

How to profile number of function calls and wall clock time using HPCToolkit?

I intend to profile the Community Earth System Model (CESM) on a cluster of 8 nodes. I am able to successfully profile the application using HPCToolkit, but I am able to get only two metrics: CPU Time (I) and CPU Time (E). I am interested in getting…
-3
votes
1 answer

Why does this command for qsub to submit multiple pbs scripts work in the bash shell but not fish?

I have a bunch of .pbs files in one directory. I can qsub the files no problem with this command in the bash shell but for the fish shell, I continuously hit enter and it just creates a new input line. Any ideas why it doesn't work in fish? for…
Kevin
-3
votes
2 answers

Submitting a matlab serial job to HPC

I want to submit serial matlab script on HPC server using: Code: #!/bin/bash #$ -N matlabjob #$ -q all.q #$ -pe mpi 1 /opt/matlab/bin/matlab -nodesktop -nosplash -r "run /home/abhishekb/Matlab/new.m;quit" > out.txt Error: License checkout…
Abhishek Bhatia
-3
votes
1 answer

Data transfer between two remote servers linux (sftp)

I work remotely through my Linux system. I have data on server A which can be accessed only through SFTP. The second server is an HPC cluster. I want to fetch data into my HPC cluster working directory from server A. How can it be done? Is there any…
-4
votes
1 answer

Explain how this Sorting algorithm works?

Can someone please explain how this sorting algorithm works? It's called "sequential_sort" void sequential_sort(std::vector& X) { unsigned int i, j, count, N = X.size(); std::vector tmp(N); for (i = 0; i <…
Nina
-4
votes
1 answer

Parallel Computing with Scilab or Octave

I have a large set of data to process [40000x50] values. I use Matlab on my laptop but it takes a very long time. Recently I got access to an HPC station on which, theoretically, I can do parallel computing. So how can I do that? I think I can't…
Don don