Questions tagged [openacc]

The OpenACC Application Program Interface describes a collection of compiler directives to specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs and accelerators.

The OpenACC directives and programming model allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

All of these details are implicit in the programming model and are managed by the OpenACC API-enabled compilers and runtimes. The programming model allows the programmer to augment information available to the compilers, including specification of data local to an accelerator, guidance on mapping of loops onto an accelerator, and similar performance-related details.
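A minimal sketch of what this looks like in practice (C/C++ with illustrative array names and sizes; the directives shown are standard OpenACC, but the exact clauses a real program needs depend on its data flow):

    #include <cstdio>

    int main() {
        const int n = 1 << 20;
        float *x = new float[n];
        float *y = new float[n];
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // The data construct tells the compiler which arrays to move to the
        // accelerator and when; the parallel loop directive asks it to map
        // the loop onto the accelerator's parallelism.
        #pragma acc data copyin(x[0:n]) copy(y[0:n])
        {
            #pragma acc parallel loop
            for (int i = 0; i < n; ++i)
                y[i] += 2.0f * x[i];
        }

        printf("y[0] = %f\n", y[0]);
        delete[] x;
        delete[] y;
        return 0;
    }

The same source also builds as a plain CPU program: compilers that do not enable OpenACC simply ignore the pragmas.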

How to Get Useful Answers to Your OpenACC Questions on Stack Overflow

Here are a number of suggestions for users new to OpenACC and/or Stack Overflow. Follow these suggestions before asking your question and you will be much more likely to get a satisfactory answer!

  • Search Stack Overflow (and the web!) for similar questions before asking yours.
  • Include an as-simple-as-possible code example in your question; you are much more likely to get a useful answer. If the code is short and self-contained (so users can test it themselves), that is even better.
403 questions
2
votes
1 answer

Do NVIDIA GPUs support branch prediction? (with OpenACC)

I'm using an NVIDIA GPU with OpenACC (NVIDIA GeForce 960, compiler: PGI 15.7). Do NVIDIA GPUs support branch prediction? My code has conditional execution inside a long loop, but when I run it on the GPU it takes a very long time. Below is example code…
soongk
  • 259
  • 3
  • 17
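Without the asker's code, a hypothetical reconstruction of the pattern being described might look like the sketch below. NVIDIA GPUs do not perform branch prediction in the CPU sense; threads of a warp that take different paths through a conditional are predicated or serialized, so a divergent branch inside a long loop can cost roughly the sum of both paths:

    #include <cmath>

    // Hypothetical sketch (not the asker's code): a conditional inside a
    // long OpenACC loop. When neighbouring iterations take different paths,
    // the warp executes both sides one after the other.
    void apply(float *a, const float *b, int n) {
        #pragma acc parallel loop copyout(a[0:n]) copyin(b[0:n])
        for (int i = 0; i < n; ++i) {
            if (b[i] > 0.0f)
                a[i] = sqrtf(b[i]);    // "expensive" path
            else
                a[i] = 0.5f * b[i];    // "cheap" path
        }
    }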
2
votes
2 answers

Using OpenACC to parallelize nested loops

I am very new to OpenACC and have only high-level knowledge, so any help and explanation of what I am doing wrong would be appreciated. I am trying to accelerate (parallelize) a not-so-straightforward nested loop that updates a flattened (3D to 1D)…
anupshrestha
  • 236
  • 5
  • 19
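A minimal sketch of the kind of loop nest the question describes, assuming a flattened (3D to 1D) array with illustrative extents; collapse(3) lets the compiler treat the three loops as one large parallel iteration space:

    // Hypothetical sketch: updating a flattened 3D array in parallel.
    void update(float *u, int nx, int ny, int nz) {
        #pragma acc parallel loop collapse(3) copy(u[0:nx*ny*nz])
        for (int i = 0; i < nx; ++i)
            for (int j = 0; j < ny; ++j)
                for (int k = 0; k < nz; ++k)
                    u[(i * ny + j) * nz + k] += 1.0f;
    }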
2
votes
2 answers

Strong scaling on GPUs

I'd like to investigate the strong scaling of my parallel GPU code (written with OpenACC). The concept of strong scaling on GPUs is, at least as far as I know, murkier than on CPUs. The only resource I found regarding strong scaling on GPUs…
lodhb
  • 929
  • 2
  • 12
  • 29
2
votes
0 answers

Thrust data transfers between host and device?

Here is the code which reproduces the unexplained behavior: main.cpp #include #include extern "C" int findme(float *ARRAY); int main(){ float *ARRAY = new float [10]; int position; ARRAY[0] = 97.7302; …
lodhb
  • 929
  • 2
  • 12
  • 29
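The findme function itself is not shown in the excerpt; a hypothetical Thrust-based implementation that makes the implicit transfers visible could look like this (illustrative only, not the asker's code):

    #include <thrust/device_vector.h>
    #include <thrust/extrema.h>

    // Hypothetical sketch of a Thrust-based findme(). Constructing the
    // device_vector copies the 10 floats host-to-device; max_element runs
    // on the device, and only the resulting index travels back.
    extern "C" int findme(float *ARRAY) {
        thrust::device_vector<float> d(ARRAY, ARRAY + 10);
        return static_cast<int>(thrust::max_element(d.begin(), d.end()) - d.begin());
    }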
2
votes
1 answer

Reshaping A Dynamic Array Using Function Parameters

Today I found this in an example file given to me by a company: void mySgemm( int m, int n, int k, float alpha, float beta, float a[m][n], float b[n][k], float c[m][k], int accelerate ) Called with: a_cpu = malloc(..); b_cpu =…
Constantin
  • 16,812
  • 9
  • 34
  • 52
2
votes
1 answer

Should host data be allocated for the create and pcreate clauses?

I am currently studying the OpenACC API, and I was wondering whether it is possible to create an array on the device without any corresponding allocated array on the host. Let's say that I want to use my old CUDA kernel and only handle memory…
chabachull
  • 70
  • 5
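One way to get device memory with no host mirror at all, which may be what the question is after, is the OpenACC runtime routine acc_malloc combined with the deviceptr clause (a sketch; the array size and loop body are illustrative):

    #include <openacc.h>

    void fill(int n) {
        // Allocate on the device only; there is no corresponding host array.
        float *d = (float *) acc_malloc(n * sizeof(float));

        // deviceptr tells the compiler the pointer is already a device
        // address, so no data movement or lookup is generated for it.
        #pragma acc parallel loop deviceptr(d)
        for (int i = 0; i < n; ++i)
            d[i] = 2.0f * i;

        acc_free(d);
    }

By contrast, the create clause is tied to a host variable (the runtime needs its address and shape), even though the host memory itself is never transferred.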
2
votes
1 answer

Multi-dimensional array copy OpenACC

I have a 2D matrix SIZE x SIZE, which I'm trying to copy to the GPU. I allocate the matrix this way: #define SIZE 1024 float (*a)[SIZE] = (float(*)[SIZE]) malloc(SIZE * SIZE * sizeof(float)); And I have this in my ACC region: void…
leo
  • 1,117
  • 1
  • 8
  • 18
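A sketch of the pointer-to-array idiom from the question together with a matching copy clause (sizes illustrative); because the allocation is one contiguous block, the whole matrix can be described with a single array section:

    #include <cstdlib>
    #define SIZE 1024

    int main() {
        // One contiguous allocation viewed as SIZE rows of SIZE floats.
        float (*a)[SIZE] = (float (*)[SIZE]) malloc(SIZE * SIZE * sizeof(float));

        #pragma acc parallel loop collapse(2) copyout(a[0:SIZE][0:SIZE])
        for (int i = 0; i < SIZE; ++i)
            for (int j = 0; j < SIZE; ++j)
                a[i][j] = (float)(i + j);

        free(a);
        return 0;
    }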
1
vote
0 answers

libquadmath.o.dylib found by gcc, but not mpicc

I want to compile some igraph code within a file that uses MPI and OpenACC. Using an igraph example (“sparsemat2.c”), it compiles with “gcc”, but not “mpicc”. $ gcc sparsemat2.c -I/usr/local/include/igraph -o sparsemat2 -ligraph $ mpicc sparsemat2.c…
Mark Bower
  • 569
  • 2
  • 16
1
vote
0 answers

Issue with Writing Array Elements to File in OpenACC

Hello OpenACC experts, I'm facing a problem with writing array elements to a file using OpenACC. Here's the relevant code snippet: #include #include using namespace std; int main() { ofstream THeOutfile; …
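The excerpt is truncated, but a common cause of this symptom is writing out the host copy of an array before it has been brought back from the device. A minimal sketch of the usual pattern, assuming that is the issue (file and array names are illustrative):

    #include <fstream>

    int main() {
        const int n = 100;
        float a[n];

        #pragma acc data create(a[0:n])
        {
            #pragma acc parallel loop
            for (int i = 0; i < n; ++i)
                a[i] = 0.5f * i;

            // File I/O runs on the host, so refresh the host copy first.
            #pragma acc update self(a[0:n])

            std::ofstream out("out.txt");
            for (int i = 0; i < n; ++i)
                out << a[i] << '\n';
        }
        return 0;
    }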
1
vote
1 answer

Unable to access CUDA device with OpenACC on WSL2 Ubuntu: Error code=34

I am new to using OpenACC on WSL2 with Ubuntu and have encountered an issue. I successfully installed the HPC SDK as instructed on the website, without installing CUDA separately, as the latest CUDA version was included with the HPC SDK. However,…
1
vote
0 answers

Use std::vector with OpenACC

I’m trying to compute, on the GPU using OpenACC, the sum of two std::vector objects. As the compiler I’m using GCC+NVPTX with OpenACC support, but when I compile the code with these flags: g++ -fopenacc -offload=nvptx-none -fopt-info-optimized-omp -g…
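OpenACC data clauses describe contiguous array sections rather than C++ container objects, so a common workaround for std::vector is to compute on the vectors' underlying storage via data(). A sketch under that assumption (names illustrative):

    #include <vector>
    #include <cstdio>

    int main() {
        const int n = 1 << 16;
        std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n);

        // Data clauses work on raw pointers, so expose the vectors' storage.
        float *pa = a.data(), *pb = b.data(), *pc = c.data();

        #pragma acc parallel loop copyin(pa[0:n], pb[0:n]) copyout(pc[0:n])
        for (int i = 0; i < n; ++i)
            pc[i] = pa[i] + pb[i];

        printf("c[0] = %f\n", c[0]);
        return 0;
    }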
1
vote
1 answer

OpenACC: Why updating an array depends on the location of the update directive

I'm new to OpenACC. I'm trying to use it to accelerate a particle code. However, I don't understand why, when updating an array (eta in the program below) on the host, I get different results depending on the location of '!$acc update self'. Here…
FeyPhys
  • 25
  • 4
1
vote
1 answer

Compiling with PGI PGCC with LAPACK and LBLAS libraries?

I'm trying to compile my OpenACC parallel C++ program, which uses the dgemm (BLAS) and dgesvd (LAPACK) functions, with the PGI PGCC compiler, linking it with the libraries like this (the program is called "VD"): #…
gamersensual
  • 105
  • 6
1
vote
1 answer

How to apply cuda-memcheck to an app with piped inputs from standard I/O

I want to use cuda-memcheck for an app with standard I/O. The app, dut, reads standard input and writes standard output. cat input.txt | cuda-memcheck ./dut -dutoptions > output.txt In this case, the dut app seems to be launched, but cuda-memcheck…
hakunom
  • 13
  • 2
1
vote
1 answer

How do I translate this simple OpenACC code to SYCL?

I have this code: #pragma acc kernels #pragma acc loop seq for(i=0; i
gamersensual
  • 105
  • 6
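The excerpt cuts off mid-loop, but the OpenACC side is a kernels region containing a loop marked seq, i.e. a loop that runs sequentially on the device. A rough SYCL analogue of that pattern, assuming the loop body only writes one array, is a single_task containing the same serial loop (illustrative sketch, SYCL 2020 style):

    #include <sycl/sycl.hpp>
    #include <vector>

    int main() {
        const int n = 1024;
        std::vector<float> a(n, 0.0f);
        sycl::queue q;

        {
            sycl::buffer<float, 1> buf(a.data(), sycl::range<1>(n));
            q.submit([&](sycl::handler &h) {
                auto acc = buf.get_access<sycl::access::mode::write>(h);
                // "loop seq" means one device thread runs the loop in order,
                // which maps to a single_task in SYCL.
                h.single_task([=]() {
                    for (int i = 0; i < n; ++i)
                        acc[i] = static_cast<float>(i);
                });
            });
        }   // buffer goes out of scope: wait and copy the results back to 'a'

        return 0;
    }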