Questions tagged [openmp]

OpenMP is a cross-platform multi-threading API which allows fine-grained task parallelization and synchronization using special compiler directives.

OpenMP is a cross-platform multi-threading API which allows fine-grained task parallelization and synchronization using special compiler directives. OpenMP offers easy access to multi-threading without requiring knowledge of system-dependent details. At the same time, it is reasonably efficient compared to fine-tuned implementations, with the bonus of being one of the easiest ways to write multi-threaded code. Forums and complete information on OpenMP are available at https://openmp.org/.

OpenMP is based on the multi-threading model and offers shared-memory parallelism, as well as heterogeneous programming for coprocessors, through compiler directives, library routines, and environment variables. It is restricted to C, C++, and Fortran applications, but in return provides portability across different shared-memory architectures.

It is through directives, added by the programmer to the code, that the compiler introduces parallelism into the application. OpenMP can be used on single-core or multi-core machines; when the compiler does not support OpenMP, the directives are simply ignored and the application runs sequentially, which preserves portability between serial and parallel builds.

The latest version is 5.2 (November 2021); see the official OpenMP specifications.

Definitive Book Guide

Helpful links

6462 questions
2
votes
2 answers

Proper way to use omp.h in macOS

When trying to compile ASIFT algorithm from C++ source on macOS, I encountered problems with OpenMP library. The compiler is Apple Clang, macOS version is 11.3. First the compiler told me that "omp.h" can't be found. I refer to this question and…
Lang Zhou
  • 53
  • 1
  • 3
  • 7
2
votes
2 answers

Detecting race conditions between OpenMP threads/CUDA streams

I am getting wrong numerical results from an application parallelized with OpenMP. Each OpenMP thread runs one or more streams on an NVIDIA GPU. I suspect that there is a race condition between OpenMP threads or CUDA streams while updating…
Kadir
  • 1,345
  • 3
  • 15
  • 25
2
votes
1 answer

Implicit private control loop variable

I have a doubt and I did not find the correct response in the OpenMP documentation. If I have a loop like this: int i; #pragma omp parallel for for(i=0;i<10;i++) //do some stuff Is the variable i implicit private, am I right? Or i have to…
Fabio
  • 336
  • 3
  • 17
2
votes
1 answer

Calling parallel C++ code in Python using Pybind11

I have a C++ code that runs in parallel with OpenMP, performing some long calculations. This part works great. Now, I'm using Python to make a GUI around this code. So, I'd like to call my C++ code inside my python program. For that, I use Pybind11…
Naomi
  • 45
  • 6
2
votes
1 answer

Adding numbers from 1 to 100 OpenMP

I'm trying to get the sum of numbers from 1 to 100 using only 5 threads even though I have 12 available. This was my approach. Please show me where I went wrong. #include #include #include int main (int argc, char…
Ulquiorra
  • 45
  • 2
  • 7
2
votes
1 answer

Efficient Parallel algorithm for array filtering

Given a very large array I want to select only the elements that match some condition. I know a priori the number of elements that will be matched. My current pseucode is: filter(list): out = list of predetermined size i = 0 for element in…
2
votes
1 answer

Is it possible to initialize a vector with openMP with O(1) complexity? (C++)

I'm trying to parallelize some vector functions in a struct using openMP. While it works well with most of my implementations, I find that since the constructor for std::vector<> has a linear complexity, I can't get better performance and instead…
Michael B
  • 53
  • 6
2
votes
1 answer

Is there a way to parallelize a lower triangle matrix solver?

The goal is to add OpenMP parallelization to for (i = 0; i < n; i++) for the lower triangle solver for the form Ax=b. Expected result is exactly same as the result when there is NO parallelization added to for (i = 0; i < n;…
anon
2
votes
1 answer

OpenMP SIMD reduction in array: "error: reduction variable must be shared on entry to this OpenMP pragma"

I am trying to compute the average value over adjacent elements within a matrix, but am stuck getting OpenMP's vectorization to work. As I understand the second nested for-loop, the reduction clause should ensure that no race conditions occur when…
St123
  • 310
  • 1
  • 9
2
votes
2 answers

Reducing the max value and saving its index

int v[10] = {2,9,1,3,5,7,1,2,0,0}; int maximo = 0; int b = 0; int i; #pragma omp parallel for shared(v) private(i) reduction(max:maximo) for(i = 0; i< 10; i++){ if (v[i] > maximo) maximo = v[i]; b = i + 100; } How can I get the…
iMAOs
  • 31
  • 2
2
votes
2 answers

"fatal error: 'omp.h' file not found" using clang on Apple M1

Clang isn't able to find omp.h whenever I try to compile with openMP flag. Here's what I'm trying to do clang++ -dynamiclib -I/opt/homebrew/Cellar/eigen/3.3.9/include/eigen3/ -Xpreprocessor -fopenmp -o libfoo.dylib foolibrary.cpp -lomp Although I…
2
votes
1 answer

Is running with different cores different from running with different thread in OpenMP?

To execute a section of a code in parallel using a known number of thread, we usually do this: #pragma omp parallel num_threads(8) {} However, how can we set number of cores instead of thread? Are these different?
MA19
  • 510
  • 3
  • 15
2
votes
1 answer

Using FFTW in OOP with multithreading

Consider a container for two-dimensional complex-valued arrays #include #include struct Array2D { typedef std::array complex; int X, Y; std::vector values; }; and a function that uses FFTW to compute the…
user15378908
2
votes
3 answers

What is the better implementation strategy?

This question is about the best strategy for implementing the following simulation in C++. I'm trying to make a simulation as a part of a physics research project, which basically tracks the dynamics of a chain of nodes in space. Each node contains…
jonalm
  • 935
  • 2
  • 11
  • 21
2
votes
1 answer

Differences between `#pragma parallel for collapse` and `#pragma omp parallel for`

Firstly, the question might be slightly misleading, I understand the main differences between the collapse clause in a parallel region and a region without one. Let's say I want to transpose a matrix and there are the following two methods, First a…
Atharva Dubey
  • 832
  • 1
  • 8
  • 25