Questions tagged [openacc]

The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs and accelerators.

Useful Links

The OpenACC directives and programming model allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.
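As a rough illustration of the directives described above, here is a minimal sketch in C. The function name `saxpy` and its parameters are hypothetical and chosen only for this example. It assumes a compiler with OpenACC support; a compiler without it ignores the pragmas and runs the loop serially on the host.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: computes y = a*x + y over n elements. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* Data clauses tell the compiler which arrays live on the accelerator:
       x is only read (copyin), y is read and written back (copy). */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        /* Ask the compiler to map the loop iterations onto the accelerator. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

No explicit device initialization, transfer calls, or shutdown code appear here; under the stated assumption, the compiler and runtime generate them from the directives.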

How to get Useful Answers to your OpenACC Questions on StackOverflow

Here are some suggestions for users new to OpenACC and/or StackOverflow. Follow them before asking your question and you are much more likely to get a satisfactory answer!

  • Search StackOverflow (and the web!) for similar questions before asking yours.
  • Include the simplest possible code example in your question. If the code is short and self-contained, so that other users can compile and test it themselves, you are much more likely to get a useful answer.
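A short, self-contained example of the kind suggested above might look like the following sketch. The function name `check_sum` and the array size `N` are hypothetical. The idea is to compare the accelerated result against a plain host loop so that readers can see at a glance whether the problem reproduces.

```c
#include <stdio.h>

#define N 1000  /* hypothetical problem size, kept small on purpose */

/* Returns 0 when the directive-annotated sum matches the host reference. */
int check_sum(void)
{
    static double a[N];
    double sum_acc = 0.0;
    double sum_ref = 0.0;

    for (int i = 0; i < N; ++i) {
        a[i] = (double)i;
    }

    /* The loop under test: a sum reduction offloaded with OpenACC. */
    #pragma acc parallel loop reduction(+:sum_acc) copyin(a[0:N])
    for (int i = 0; i < N; ++i) {
        sum_acc += a[i];
    }

    /* Host-only reference computed without any directives. */
    for (int i = 0; i < N; ++i) {
        sum_ref += a[i];
    }

    printf("accelerated = %.1f, reference = %.1f\n", sum_acc, sum_ref);
    return (sum_acc == sum_ref) ? 0 : 1;
}
```

When posting, it also helps to state the compiler, its version, and the exact command line used, since diagnostics differ between implementations.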
403 questions
0
votes
1 answer

How to resolve "Scalar last value needed" in OpenACC

While compiling OpenACC code, I am getting the following warnings: 215, Scalar last value needed after loop for x at line 239 Scalar last value needed after loop for y at line 239 Scalar last value needed after loop for x at line 240 …
0
votes
1 answer

Correct use of device_type in OpenACC

I have a for loop and I want to parallelize it with OpenACC if the target hardware is NVIDIA, or run it serially when the target hardware is AMD. I tried the following: #pragma acc loop \ device_type(tesla) parallel \ device_type(radeon)…
AstrOne
  • 3,569
  • 7
  • 32
  • 54
0
votes
1 answer

OpenACC nested loops with dynamic arrays

I am trying to apply OpenACC to develop multicore and GPU-accelerated binaries. I have read the Farber book and successfully ran test programs from there and from some online courses offered by NVIDIA. Then, I attempted to parallelize on our…
0
votes
0 answers

OpenACC dependency error

Please I need some help here: void RBM::sample_v_given_h(int *h0_sample, double *mean, int *sample) { int c = 0; double r; for(int i=0; i
0
votes
1 answer

report PGCC-S-0000-Internal errors for _mp_malloc while there's no heap allocations

When I tried to compile my code in OpenACC, it reports: PGCC-S-0000-Internal compiler error. Call in OpenACC region to support routine - _mp_malloc (/home/lisanhu/mine/ws/C/AccSeqC/as_align.cc: 92) PGCC-S-0155-Compiler failed to translate…
Sanhu Li
  • 402
  • 5
  • 11
0
votes
1 answer

OpenACC create data while running inside a kernels region

I have a task that is to be accelerated with OpenACC. I need to do dynamic memory allocation within a kernel computation. I've built a simpler demo for it as follows. #include using namespace std; #pragma acc routine seq int…
Sanhu Li
  • 402
  • 5
  • 11
0
votes
2 answers

RBM no improvement with OpenACC on the code yet

The RBM algorithm is open source; the source code is available here: https://github.com/yusugomori/DeepLearning/tree/master/cpp I tried to get an improvement with OpenACC in different ways, but the sequential code is still faster. So can you tell me…
0
votes
1 answer

Line too long in PGI 16.9. How to solve?

Use the following dummy code to replicate the issue. program pp implicit none real*8,dimension(45) :: refPoints refPoints(:) = (/ -1.0 , 1.0 , 1.0 , -1.0 , -1.0 , 1.0 , 1.0 , -1.0 , 0.0 , 1.0 , 0.0 , -1.0 , 0.0 , 1.0 , 0.0 , -1.0 , -1.0 , 1.0…
0
votes
1 answer

OpenACC and cache tiling

----- example code ----------- for (body1 = 0; body1 < NBODIES; body1 ++) { for (body2=0; body2 < NBODIES; body2++) { OUT[body1] += compute(body1, body2); } } ----- blocking code------ for (body2 = 0; body2 < NBODIES; body2 +=…
Grizz
  • 17
  • 2
0
votes
1 answer

How can a Fortran-OpenACC routine call another Fortran-OpenACC routine?

I'm currently attempting to accelerate a spectral element fluids solver by porting most of the routines to a GPGPU using OpenACC with the PGI (15.10) compiler. The source code is written in OO-Fortran. This software has "layers" of subroutines that…
0
votes
1 answer

OpenACC and GNU Scientific Library - data movement of gsl_matrix

I've watched the recorded OpenACC overview course videos up to lecture 3, which talks about expressing data movement. How would you move a gsl_matrix* from CPU to GPU using copy_in()? For example, on the CPU I can do something like, gsl_matrix *Z =…
navmendoza
  • 21
  • 1
  • 5
0
votes
1 answer

Join array results in OpenACC

I'm writing OpenACC code that has an array dependence. Each iteration of the inner loop can update the same position of the array. Here's some code: long unsigned int digits[d + 11]; for (long unsigned int digit = 0; digit < d + 11; ++digit) …
0
votes
0 answers

OpenACC bitonic sort is much slower on GPU than on CPU

I have the following bit of code to sort double values on my GPU: void bitonic_sort(double *data, int length) { #pragma acc data copy(data[0:length], length) { int i,j,k; for (k = 2; k <= length; k *= 2) { for (j=k >> 1; j > 0; j =…
0
votes
1 answer

OpenACC and floor/ceil functions

I want to use the floor/ceil functions of C in an OpenACC project. When I try to make an atomic update of a value: #pragma acc atomic update x=floor(x)+c the compiler shows the following message: PGCC-S-0155-Invalid atomic expression…
0
votes
1 answer

OpenACCArray swap function

While trying to create an object-oriented OpenACC implementation, I stumbled upon this question. From there I took the code provided by @mat-colgrove at the GTC15 (code available at http://www.pgroup.com/lit/samples/gtc15_S5233.tar). Since I am…
dwn
  • 413
  • 3
  • 12