Questions tagged [parallelism-amdahl]

Amdahl's law, also known as Amdahl's argument, is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors. The law is named after computer architect Gene Amdahl, and was presented at the AFIPS Spring Joint Computer Conference in 1967.

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, suppose a program needs 20 hours on a single processor core, and a particular one-hour portion of the program cannot be parallelized, while the remaining 19 hours (95%) of execution time can be. Then regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour. Hence the theoretical speedup is limited to at most 20×.
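The 20-hour example above follows directly from the formula S(n) = 1 / ((1 − p) + p/n), where p is the parallelizable fraction and n the number of processors. A minimal sketch (the function name is illustrative, not from any library):

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Theoretical speedup per Amdahl's law:
    S(n) = 1 / ((1 - p) + p / n)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# The 20-hour example: 95% of the work is parallelizable (p = 0.95).
# As n grows, the speedup approaches 1 / (1 - 0.95) = 20, never exceeding it.
print(round(amdahl_speedup(0.95, 4), 2))     # modest gain on 4 cores
print(round(amdahl_speedup(0.95, 4096), 2))  # close to the 20x ceiling
```

Note how quickly returns diminish: 4096 processors still yield less than the full 20× bound set by the serial hour.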

106 questions
2
votes
4 answers

Optimisation tips to find which triangle a point belongs to

I'm having some trouble optimising my algorithm: I have a disk (centered at 0, with radius 1) filled with triangles (not necessarily of the same area/length). There could be a HUGE number of triangles (say from 1k to 300k). My…
2
votes
2 answers

python joblib & random walk - performance of [CONCURRENT]-process scheduling

Here is my python-3.6 code for simulating a 1D reflected random walk, using the joblib module to generate 400 realizations concurrently across K workers on a Linux cluster machine. I note, however, that the runtime for K=3 is worse than for K=1, and…
2
votes
2 answers

OpenMP worst performance with more threads (following openMP tutorials)

I'm starting to work with OpenMP and I follow these tutorials: OpenMP Tutorials I'm coding exactly what appears on the video, but instead of a better performance with more threads I get worse. I don't understand why. Here's my code: #include…
2
votes
1 answer

What is the max degree of parallelism for reading/writing files?

Let's say I have 100 text files file_0.txt file_1.txt . . . file_99.txt and I want to read from them as fast as possible. I'm a software developer and don't have a great background in hardware. So I'm wondering if the "max degree of parallelism"…
user7127000
2
votes
1 answer

Why does the get() operation in multiprocessing.Pool.map_async take so long?

import multiprocessing as mp import numpy as np pool = mp.Pool( processes = 4 ) inp = np.linspace( 0.01, 1.99, 100 ) result = pool.map_async( func, inp ) #Line1 ( func is some Python function which acts on input ) output = result.get() …
2
votes
2 answers

Why does Python multiprocessing take more time than serial code? How to speed this up?

I was trying out the Python multiprocessing module. In the code below the serial execution time is 0.09 seconds and the parallel execution time is 0.2 seconds. As I am getting no speedup, I think I might be going wrong somewhere import multiprocessing…
2
votes
1 answer

How to parallelize matrix sorting for loop?

I am trying to parallelize a for(){...} loop, using OpenMP, which takes the N "lines" of an N*M "table" and sorts each line in ascending order. I added #pragma omp parallel and #pragma omp for schedule directives, but don't see any changes,…
2
votes
2 answers

Parallelizing for loop in Python

I coded a neural network that runs really slow, so I was hoping to speed it up a little by parallelizing a certain loop. I am not sure about the implementation, how the GIL works, and whether it's relevant for me. The code looks like this: class…
Eumel
2
votes
1 answer

Why is a Compare-And-Swap operation limited by Amdahl's law?

Martin Thompson asserts that an STM relying on a ref that relies on CAS will ultimately be limited by Amdahl's law. Amdahl's law being that the maximum performance of a parallel program is limited by the sequential (non-parallel) part of the…
1
vote
1 answer

Determining the Parallel and Serial Region of Code and Calculating Speedup using Amdahl's Law

I was trying to understand the workings of Amdahl's law but got confused in the process. Consider the following problem: Suppose a program has a part at the beginning that is sequential in nature (must be executed by only one processor) and takes 3…
malik727
1
vote
1 answer

Why does my code run so much slower with joblib.Parallel() than without?

I am new to using joblib.Parallel() to speed up some massive numpy.fft calculations. I followed this example presented on joblib-web. Using the example, I can see the following result on my computer: Elapsed time computing the average of couple of slices…
1
vote
3 answers

Trying to understand Amdahl's Law

I am trying to answer a school assignment but I am getting confused about what the question is asking. A design optimization was applied to a computer system in order to increase the performance of a given execution mode by a factor of 10. The…
1
vote
2 answers

How to apply Amdahl's law on a given piece of code?

I have the following question in my assignment. I know that I need to use Amdahl's law but I don't know which part is going to be which part in the formula. Here is the question: How much will the following code speed up if we run it…
user16438868
1
vote
1 answer

How to measure OMP time with OMP tasks & recursive task workloads?

Below I try to sketch code that is parallelised using OpenMP tasks. In the main function a parallel region is started and immediately after doing so the code is wrapped into a #pragma omp master section. After computing the expected workload…
1
vote
1 answer

Parallelizing a Python loop

I'm a bit lost between joblib, multiprocessing, etc. What's the most effective way to parallelize a for loop, based on your experience? For example: for i, p in enumerate(patches[ss_idx]): bar.update(i+1) …