Questions tagged [openacc]

The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs and accelerators.

Useful Links

The OpenACC directives and programming model allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.

How to get Useful Answers to your OpenACC Questions on StackOverflow

Here are a number of suggestions to users new to OpenACC and/or StackOverflow. Follow these suggestions before asking your question and you are much more likely to get a satisfactory answer!

  • Search StackOverflow (and the web!) for similar questions before asking yours
  • Include an as-simple-as-possible code example in your question and you are much more likely to get a useful answer. If the code is short and self-contained (so users can test it themselves), that is even better.
403 questions
0
votes
2 answers

OpenACC must have routine information error

I am trying to parallelize a simple Mandelbrot C program, but I get an error about missing acc routine information. Also, I am not sure whether I should be copying data in and out of the parallel section. PS I am relatively new…
0
votes
1 answer

OpenACC - Nested loop strange behaviour

I'm working on LU decomposition of block diagonal matrices using OpenACC. The decomposition is correct when I run my code sequentially, but under OpenACC directives I get wrong results. LU…
Random Tourist
  • 151
  • 1
  • 1
  • 12
0
votes
1 answer

OpenACC - Private 2D array

I'm dealing with block diagonal matrices (each block has the same size) and I get an illegal address error when I use a private, dynamically allocated 2D array... // NB is the number of blocks // N is the block size // A is the main matrix (block…
Random Tourist
  • 151
  • 1
  • 1
  • 12
0
votes
0 answers

Collapse clause on loop directive of OpenACC works occasionally. Why?

I have an MD code with 4 nested loops. Assuming each loop is linear in N, the complexity of the code is O(N^4). With OpenACC, I can potentially tell the compiler to collapse all of them into one by using collapse(4) on my outer loop. However,…
mgNobody
  • 738
  • 7
  • 23
0
votes
1 answer

Newbie OpenACC issue with CYCLE instruction in Fortran

Quite a newbie with OpenACC here, so please be patient :-) I'm trying to port some Fortran code to OpenACC, and I'm seeing strange (at least to me) behaviour. The code is given below, but as you can see it is just some nested loops which…
Angel de Vicente
  • 1,928
  • 3
  • 12
  • 16
0
votes
2 answers

-ta=tesla:managed:cuda8 but cuMemAllocManaged returned error 2: Out of memory

I'm new to OpenACC. I like it very much so far as I'm familiar with OpenMP. I have 2 1080Ti cards each with 9GB and I've 128GB of RAM. I'm trying a very basic test to allocate an array, initialize it, then sum it up in parallel. This works for 8 GB…
Matt Dowle
  • 58,872
  • 22
  • 166
  • 224
0
votes
0 answers

finding the minimum value takes too much memory operations on GPU

I have 9 2D arrays, and I want to calculate the element-wise minimum of 8 of the arrays and store it in the 9th. I have written code for this, but it performs too many memory operations. My GPU is an Nvidia GTX 1060 and I am using OpenACC to…
Ibrahim
  • 152
  • 2
  • 9
0
votes
1 answer

Floating point exception in OpenACC Merge Sort program

#include #include #include #include #include #define THR 10 //Function to test if the output is in ascending order or not void test(int a[], int n) { int i; for (i=1;i
0
votes
1 answer

OpenACC - Sparse Matrix Library

I'm using OpenACC for sparse matrix computation in C++. I need to use matrix operations within OpenACC regions. Are there any sparse matrix libraries compatible with OpenACC? I'm used to Eigen but it seems that it isn't compatible with OpenACC…
Random Tourist
  • 151
  • 1
  • 1
  • 12
0
votes
1 answer

OpenACC vs C++: FATAL ERROR: variable is partially present on the device

I'm trying to port a C++ application to the GPU using OpenACC. As one might expect, the C++ code has a lot of encapsulation and abstraction. Memory is allocated in some vector-like class, then this class gets reused in many other classes around the…
Nikolai
  • 1,499
  • 12
  • 24
0
votes
0 answers

GCC compile error

I am trying to build and run GCC. After trying to build it a couple different ways I got it to build, however when I try to compile code (example:) gcc -fopenacc -foffload=nvptx-none -o foo foo.c with it I get this error: lto-wrapper fatal error…
meh93
  • 311
  • 4
  • 13
0
votes
1 answer

Modifying loop variable (index) in OpenACC

I have a situation where I need to repeat a specific iteration of the loop multiple times. So, in that specific iteration, I decrement the index by one so that the next increment of the loop index makes no difference. This approach, which is the…
mgNobody
  • 738
  • 7
  • 23
0
votes
1 answer

At what code complexity does an OpenACC kernel lose efficiency on common GPU?

At about what code complexity do OpenACC kernels lose efficiency on a common GPU, so that register use, shared memory operations, or some other aspect starts to bottleneck performance? Also, is there some point where too few tasks and the overhead of transferring to…
Flow
  • 19
  • 6
0
votes
1 answer

Shared variables and OpenACC

In OpenMP, one can use shared variables in a loop with #pragma omp parallel for shared(foo) private(bar). In OpenACC we have a private clause, but no shared clause. On the other hand, there are data clauses such as copy, copyin, and copyout. Sometimes, we…
marmistrz
  • 5,974
  • 10
  • 42
  • 94
0
votes
1 answer

How can I check if OpenACC works on my computer?

I know that GCC 6.x has decent OpenACC support, but I want to make sure it works correctly on my computer. I tried #include #include int main(int argc, char *argv[]) { acc_device_t dev = acc_get_device_type(); …
marmistrz
  • 5,974
  • 10
  • 42
  • 94