Questions tagged [openacc]

The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs and accelerators.

The OpenACC directives and programming model allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.
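As a rough illustration of the model described above, here is a minimal sketch in C. The function name `saxpy` and its parameters are hypothetical and do not come from the tag description. The `copyin` and `copy` data clauses tell the compiler which arrays live on the accelerator for the duration of the region, and `parallel loop` asks it to map the loop onto the device. A compiler without OpenACC support simply ignores the pragma and runs the loop on the host.

```c
#include <stddef.h>

/* Hypothetical example: computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(int n, float a, const float *x, float *y)
{
    /* copyin: x is only read on the device.
       copy:   y is sent to the device and copied back afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy(4, 2.0f, x, y)` on the host looks like any ordinary function call; no explicit accelerator initialization or transfer code appears in the source.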

How to get Useful Answers to your OpenACC Questions on StackOverflow

Here are some suggestions for users new to OpenACC and/or StackOverflow. Follow them before asking your question and you are much more likely to get a satisfactory answer!

  • Search StackOverflow (and the web!) for similar questions before asking yours.
  • Include the simplest possible code example in your question. If the code is short and self-contained, so that others can test it themselves, that is even better.
403 questions
0
votes
1 answer

Makefile to link against Armadillo library using the PGI compiler

I am trying to use the -U__GNUG__ flag of the PGI compiler in a Makefile, attempting to compile all the .cpp files within a folder, linking against Armadillo. (Using the g++ compiler, the code compiles and runs.) The Makefile that I have is as…
user3116936
  • 492
  • 3
  • 21
0
votes
1 answer

Can the PGI compilers output the generated CUDA code to a file?

I would like the generated CUDA code to be saved in a file for examination. Is this possible with OpenACC and the PGI compilers?
0
votes
1 answer

How to copy user-defined types with OpenACC

I am using the PGI 15.7 compiler. I would like to know how to copy a user-defined structure from the CPU to the GPU with OpenACC. typedef struct str_ { int n; int m; int* col; // size [n*m] double* val; // size [n*m] }str; Thank you very…
behzad baghapour
  • 127
  • 2
  • 11
0
votes
1 answer

Remove data dependency in C

I have code with a data dependency. Code example: int data[1000*3]; int result[1000]={0,}; // initialize data[] for(i=0; i<1000; i++) { a = data[i*3 + 0]; b = data[i*3 + 1]; c = data[i*3 + 2]; if( (a>b) && (a>c) ) // This line…
soongk
  • 259
  • 3
  • 17
0
votes
2 answers

Can I profile an OpenACC kernel at the C source code level?

I'm trying to speed up my code with OpenACC using the PGI 15.7 compiler. I want to profile my code at the C source level. I'm using the 'nvvp' profiler from CUDA 7.0. When I run nvvp, I can use the 'analysis tab' and see which latency is the reason my code…
soongk
  • 259
  • 3
  • 17
0
votes
2 answers

Efficient use of memories in OpenACC

I am working on an OpenACC computational fluid dynamics code to increase the granularity of computations inside a loop by breaking down the overall computations into a bunch of small operations. My final goal is to reduce the number of registers per…
behzad baghapour
  • 127
  • 2
  • 11
0
votes
1 answer

OpenACC shared memory usage

I am working with OpenACC using the PGI compiler. I want to know how I can profile the code's memory usage at runtime, especially the shared memory. Thank you so much for your help! Behzad
behzad baghapour
  • 127
  • 2
  • 11
0
votes
2 answers

Can I use OpenACC to parallelize a large code that calls some functions?

I'm trying to parallelize my sequential C code and offload it to an NVIDIA GPU with OpenACC (PGI compiler). My code is written as sequential code and frequently calls very long functions, like below. int main() { // blah blah... for(i=0; i<10;…
soongk
  • 259
  • 3
  • 17
0
votes
1 answer

Dynamic/Nested Parallelism of GPU with OpenMP programming model

My question is related to the declare target construct of OpenMP and the dynamic/nested parallelism feature of GPUs. OpenACC 2.0 supports dynamic parallelism in two ways: the routine directive and nested parallel/kernels directives. But using…
grypp
  • 405
  • 2
  • 15
0
votes
3 answers

OpenACC compiler: How to download and use CAPS compiler

I want to write OpenACC programs, but I cannot find a compiler for this kind of program. The PGI compiler is not free for some countries like Iran. I want to ask how to download the CAPS compiler; I cannot find any link. In one post linked to this…
rijisoft
  • 57
  • 2
  • 13
0
votes
3 answers

OpenACC 2.0 routine: data locality

Take the following code, which illustrates the calling of a simple routine on the accelerator, compiled on the device using OpenACC 2.0's routine directive: #include #pragma acc routine int function(int *ARRAY,int multiplier){ …
lodhb
  • 929
  • 2
  • 12
  • 29
0
votes
1 answer

OpenACC boundary issue

I am writing a very simple vector addition kernel in OpenACC, and I am wondering whether this is an issue with the compiler I am using (accULL with OpenCL), as I seem to be having an issue copying data back to the host from the device. All the results…
Jacob
  • 3,521
  • 6
  • 26
  • 34
0
votes
1 answer

Does OpenACC take away from the normal GPU Rendering?

I'm trying to figure out if I can use OpenACC in place of normal CPU serial execution calls. Usually my programming is all about 3D programming, or uses the GPU normally in some way, i.e., image processing, or some other type of rendering that…
pBlack
  • 102
  • 2
  • 10
0
votes
1 answer

How to interface OpenACC with cublasDgetrfBatched in Fortran?

I've been working on a Fortran code which uses the cuBLAS batched LU and cuSPARSE batched tridiagonal solver as part of a BiCG iterative solver with ADI preconditioner. I'm using a Kepler K20X with compute capability 3.5 and CUDA 5.5. I'm doing this…
jah
  • 3
  • 3
0
votes
1 answer

Is there an OpenACC counterpart of the OpenMP directive THREADPRIVATE?

In my OpenMP project, I use a do loop over a threaded subroutine "tool" and am restricted to pass a single-variate function to the threaded subroutine "tool." However, in my mathematical model, the function has to take one more argument, so I need…
Li-Pin Juan
  • 1,156
  • 1
  • 13
  • 22