Questions tagged [openmp]

OpenMP is a cross-platform multi-threading API which allows fine-grained task parallelization and synchronization using special compiler directives.

OpenMP offers easy access to multi-threading without requiring knowledge of system-dependent details. At the same time, it is reasonably efficient compared to fine-tuned implementations, with the bonus of being among the easiest ways to write multi-threaded code. Forums and complete information on OpenMP are at https://openmp.org/.

OpenMP is based on a multi-threaded model and offers shared-memory parallelism and heterogeneous programming for coprocessors through compiler directives, library routines, and environment variables. It is restricted to C/C++ and Fortran applications, but provides portability across different shared-memory architectures.

The programmer adds directives to the code, and the compiler uses them to introduce parallelism into the application. OpenMP programs can run on single-core or multi-core machines. When the directives are ignored (for example, by a compiler without OpenMP support enabled), the application executes sequentially, so the same source stays portable.
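As a minimal sketch of the three mechanisms named above (a compiler directive, a library routine, and an environment variable), the C fragment below sums an array in parallel. The function names `sum_array` and `available_threads` are hypothetical and chosen only for illustration; the `_OPENMP` macro, the `reduction` clause, and `omp_get_max_threads()` are part of the OpenMP specification.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>   /* only present when the compiler has OpenMP enabled */
#endif

/* Hypothetical helper: sum n doubles.
   The directive splits the loop iterations across threads, and the
   reduction clause combines the per-thread partial sums into total.
   Without OpenMP support the pragma is ignored and the loop runs
   sequentially, giving the same result for this input. */
double sum_array(const double *x, size_t n)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += x[i];
    }
    return total;
}

/* Hypothetical helper: upper bound on the threads a parallel region
   may use. Falls back to 1 when compiled without OpenMP. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag, and the thread count can be set at run time through the `OMP_NUM_THREADS` environment variable. Building the same file without the flag should still compile and run sequentially.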

Latest version is 5.2 (November 2021): Official OpenMP specifications.

Definitive Book Guide

Using OpenMP: Portable Shared Memory Parallel Programming - Barbara Chapman et al.
Using OpenMP - The Next Step: Affinity, Accelerators, Tasking, and SIMD - Ruud van der Pas et al.
Parallel Programming in OpenMP - Rohit Chandra.
An Introduction to Parallel Programming - Peter Pacheco.
Parallel Programming in C with MPI and OpenMP - Michael J. Quinn.

6462 questions

1 answer

Numbers not randomized after runs

I'm trying to create an openMP program that randomizes double arrays and run the values through the formula: y[i] = (a[i] * b[i]) + c[i] + (d[i] * e[i]) + (f[i] / 2); If I run the program multiple times I've realised that the Y[] values are the same…

c openmp

asked Apr 15 '22 at 04:25

Ibrahim

0 answers

Templating and OpenMP causes free(): double free detected in tcache 2

I've worked for a while to get my code to a minimal reproducible example and I think I have it. See the single main.cpp function below, compiled (on Linux) one of two ways: In serial: g++ -O3 --std=c++17 -o test_rho.exe main.cpp With OpenMP: g++…

c++ templates openmp stdvector free

asked Apr 09 '22 at 17:06

drjrm3

1 answer

OpenMP parallel loop much slower than regular loop

The whole program has been shrunk to a simple test: const int loops = 1e10; int j[4] = { 1, 2, 3, 4 }; time_t time = std::time(nullptr); for (int i = 0; i < loops; i++) j[i % 4] += 2; std::cout << std::time(nullptr) - time <<…

c++ performance parallel-processing openmp

asked Apr 06 '22 at 13:21

Kaiyakha

1 answer

Faulty benchmark, puzzling assembly

Assembly novice here. I've written a benchmark to measure the floating-point performance of a machine in computing a transposed matrix-tensor product. Given my machine with 32GiB RAM (bandwidth ~37GiB/s) and Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz…

assembly openmp performance-testing icc microbenchmark

asked Mar 25 '22 at 13:53

Nitin Malapally

1 answer

First touch in case of small sized data sharing on Linux

The "first touch" (a special term used to indicate virtual memory mapping in case of NUMA systems) write-operation causes the mapping of memory pages to the NUMA node associated with the thread which first writes to them. Having read this page,…

multithreading linux-kernel openmp shared-memory numa

asked Mar 22 '22 at 11:01

Nitin Malapally

1 answer

How do OpenMP thread ids work with recursion?

Here is a simple recursive program that splits into two for every recursive call. As expected, the result is 2 + 4 + 8 calls to rec, but the number of threads is always the same: two, and the ids bounce back and forth between 0 and one. I expected…

c recursion openmp

asked Mar 16 '22 at 22:20

t-taketsune

2 answers

Increasing array index in openMP

I am new to using OpenMP. I am trying to parallelize a nested loop, and so far I have something of this form... #pragma omp parallel for for (j=0;j

c++ c parallel-processing openmp openmpi

asked Mar 08 '22 at 21:50

S2022

3 answers

Problem of sorting OpenMP threads into NUMA nodes by experiment

I'm attempting to create a std::vector> with one set for each NUMA-node, containing the thread-ids obtained using omp_get_thread_num(). Topo: Idea: Create data which is larger than L3 cache, set first touch using thread 0, perform…

c++ multithreading openmp affinity numa

asked Mar 03 '22 at 16:50

Nitin Malapally

3 answers

How to optimize omp parallelization when batching

I am generating class Objects and putting them into std::vector. Before adding, I need to check if they intersect with the already generated objects. As I plan to have millions of them, I need to parallelize this function as it takes a lot of time…

c++ performance openmp

asked Mar 02 '22 at 13:26

Alexander S

1 answer

Clang + OpenMP inefficient loop invariants

I came across some inefficient code generation by Clang while answering a different question (How do i parallelize this code using openmp with reduction) Let's consider this simple code: void scale(float* inout, ptrdiff_t n, ptrdiff_t m, ptrdiff_t…

c++ optimization clang openmp

asked Feb 28 '22 at 12:16

Homer512

1 answer

Installing OpenMP on Mac m1. 'clang: error: unsupported option '-fopenmp'' when running a setup.py

I am using the MacBook Pro M1. I have a Python package that I am trying to install which is compiling C files and has the setup.py file as sources = ['*.c'], include_dirs=['##Directory Name##'], …

compiler-errors clang openmp apple-m1

asked Feb 27 '22 at 11:39

Saram Abbas

1 answer

How can I realize data local spawning or scheduling of tasks in OpenMP on NUMA CPUs?

I have this simple self-contained example of a very rudimentary 2 dimensional stencil application using OpenMP tasks on dynamic arrays to represent an issue that I am having on a problem that is less of a toy problem. There are 2 update steps in…

c++ task openmp numa

asked Feb 27 '22 at 10:42

user151387

0 answers

OpenMP + Fortran on Apple M1 is slower than MPI+Fortran

I have a new MacBook pro with the Apple M1 Max processor (10 cores total), running OS 12.2.1. I used Homebrew to install gcc: ~/homebrew/bin/gcc-11 --version gcc-11 (Homebrew GCC 11.2.0_3) 11.2.0 Copyright (C) 2021 Free Software Foundation,…

fortran mpi openmp gfortran apple-m1

asked Feb 23 '22 at 15:53

Jerome Orosz

1 answer

Question to ARB on target construct limitations

In a research project we are developing a special-purpose floating-point accelerator. In this context, our original vision was to have a kind of "two-stage" or "nested" offload from ARM host -> RISCV-managed accelerator cluster -> actual…

openmp

asked Feb 21 '22 at 09:48

Kai Plociennik

0 answers

Eigen matrix multiply triggers code to link to libomp.dll; it is already linked to libiomp5md.lib/dll. How do I stop this spurious linkage

A short complex piece of code is built with cmake; it links to Lapack(MKL), gmp/mpir, boost,.. and eigen3 (the versions are from vcpkg and the version of eigen is eigen3:x64-windows 3.3.9#1). I am currently testing it on Visual Studio 2019. At…

c++ cmake openmp eigen3

asked Feb 13 '22 at 18:05

tjl
