Questions tagged [openacc]

The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs and accelerators.

The OpenACC directives and programming model allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.
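As a rough illustration of the two kinds of hints mentioned above (data local to the accelerator and loop mapping), here is a minimal sketch in C. The function name `saxpy` and its parameters are hypothetical and not taken from the tag description. The directives are ordinary OpenACC pragmas, so a compiler without OpenACC support simply ignores them and runs the loop on the host.

```c
/* Hypothetical helper for illustration: computes y = a*x + y. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* Data clauses: x is only read on the device, so it is copied in;
       y is copied in and copied back out when the region ends. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        /* Loop mapping guidance: distribute iterations across gangs
           and vector lanes of the accelerator. */
        #pragma acc parallel loop gang vector
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

To enable the directives, OpenACC support has to be switched on at compile time, for example with `-acc` for the NVIDIA/PGI compilers or `-fopenacc` for GCC. Device initialization and the actual transfers are then handled by the compiler and runtime rather than by the program.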

How to get Useful Answers to your OpenACC Questions on StackOverflow

Here are a number of suggestions for users new to OpenACC and/or StackOverflow. Follow these suggestions before asking your question and you are much more likely to get a satisfactory answer!

  • Search StackOverflow (and the web!) for similar questions before asking yours
  • Include an as-simple-as-possible code example in your question and you are much more likely to get a useful answer. If the code is short and self-contained (so users can test it themselves), that is even better.
403 questions
3
votes
0 answers

Accelerate the use of MPI types

I have a Fortran program that uses MPI types to describe subarrays for data transfers, which spares me from creating send/receive buffers manually. This works nicely, but when accelerating the program with OpenACC, the subarrays may not be contiguous in…
3
votes
1 answer

Loop carried dependence of `->` prevents parallelization

I have a Model class that holds data for a model and runs several functions on that data. The details are probably not too important except that it has the following design: Variables are stored in the class namespace. Variables are initialized and…
Richard
  • 56,349
  • 34
  • 180
  • 251
3
votes
1 answer

Can I use std::bitset's functions with OpenACC?

Is it possible to use bitset's functions in an OpenACC region? An example code: #include #include #pragma acc routine seq int mystrcmp (const char *, const char *); int main(int argc, char** argv) { long sum = 3, i; …
fokhagyma
  • 73
  • 1
  • 7
3
votes
1 answer

OpenACC Library Interoperability: how to get device pointer?

We have a project that is written in Fortran. Now I know this can be done using PGI compilers but I don't want to get stuck with licenses. I am trying to see whether we could use OpenACC in our project. I got gcc5.2 installed using instructions…
Vikram
  • 308
  • 1
  • 5
3
votes
1 answer

Manual Deep Copy to Device in C

I am attempting to parallelize a program that does some image processing with OpenACC. As a part of this processing I have a custom structure defined similar to: typedef struct { RGB *image; double property; } Deep; Which I am accessing within…
challett
  • 906
  • 6
  • 16
3
votes
0 answers

Undefined symbol error in MEX when calling a routine from a PGCC-compiled OpenACC-accelerated shared library

I have a shared library libraberto.so compiled with PGCC. It contains OpenACC pragma directives and is compiled with the -acc flag to ensure these directives are enabled. The corresponding makefile rules are: libraberto.so: file1.o file2.o ... pgcc…
lodhb
  • 929
  • 2
  • 12
  • 29
3
votes
1 answer

Use of shared memory with OpenACC

I'm trying to use shared memory to cache things with OpenACC. Basically what I'm working on is a matrix multiplication, and what I have is this: typedef float ff; // Multiplies two square row-major matrices a and b, puts the result in c. void…
leo
  • 1,117
  • 1
  • 8
  • 18
2
votes
2 answers

OpenACC program built with C++ compiler is way slower than C built version

The code I'm working on is in C++ and is slightly complicated but the example below shows the problem. It comes from a book by Chandrasekaran and Juckeland. If it is compiled with nvc -acc (or pgcc -acc, as the authors did) and run, it takes a…
paww
  • 23
  • 3
2
votes
0 answers

How to read data using GPU with an OpenACC Fortran code?

I am trying to upgrade some legacy f77 codes to analyse molecular dynamics trajectories. I have had some success with modern Fortran with OpenACC directives. My aim now is to read the LAMMPS trajectory files which are usually big, about a few GBs,…
Gyrtle
  • 31
  • 5
2
votes
1 answer

How to apply a reduce OpenACC directive to a multidimensional vector?

I'm trying to parallelize some code with OpenACC. #pragma acc parallel loop reduction (+:matriz()) for(auto i = 0; i <= (width-siz); i += siz) for(auto j = 0; j <= (width-siz); j += siz) for(auto k = 0; k…
2
votes
0 answers

Error: cannot compute suffix of object files: cannot compile while building GCC 10 with offloading OpenACC Support to Nvidia GPU

I am trying to install GCC 10 for Nvidia PTX on Ubuntu 20.04 so that I can offload the openACC loads to Nvidia GPU. I am following the steps given in this link Installing nvptx-tools git clone…
2
votes
1 answer

Compiling c++ OpenACC parallel CPU code using GCC (G++)

When trying to compile OpenACC code with GCC-9.3.0 (g++) configured with --enable-languages=c,c++,lto --disable-multilib the following code does not use multiple cores, whereas if the same code is compiled with the pgc++ compiler it does use…
Lewis0112
  • 23
  • 3
2
votes
1 answer

Calling Fortran OpenACC from CUDA file. How to compile with PGI?

I have a CUDA code in which I would like to include external code that consists of Fortran with OpenACC kernels. I have two files with the following content inspired by a discussion on the NVIDIA website. File main.cu is the following: #include…
Chiel
  • 6,006
  • 2
  • 32
  • 57
2
votes
1 answer

How to perform manual deep copy of 2D dynamic array of struct in C using OpenACC

I am trying to modify an existing particle method code using OpenACC to run on a GPU. The existing code utilizes a 2D dynamic array of struct in C. I need to copy the structure(s) to the GPU for further calculation. A code sample is given below: typedef…
Ali Imran
  • 27
  • 4
2
votes
1 answer

Portable random number generation with OpenACC

Is there a portable way to generate random numbers with OpenACC? I know that it is possible to directly use cuRand but then I am restricted to Nvidia GPUs. Another option seems to be generating numbers on the host and then moving them to the device,…
Fabian
  • 547
  • 1
  • 4
  • 17