CUDA multiple threads work with one pointer

Question

Suppose I have an array Y and I want to increment Y[0] from multiple threads. If I just do Y[0]++ in a __global__ function, Y[0] ends up as 1. How do I resolve this?

One approach would be to use [atomics](https://stackoverflow.com/questions/20726299/how-does-warp-work-with-atomic-operation/20726558#20726558). Another approach would be a [classical parallel reduction](https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf). This is a fairly basic concept, so variants of this question have been asked many times here on the `cuda` tag. – Robert Crovella, Nov 27 '18 at 19:56
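To make the reduction idea concrete, here is a minimal sketch, not taken from the linked slides. Each thread contributes a 1, the threads of a block sum their contributions in shared memory, and only thread 0 of each block touches the global counter. The names `block_reduce_increment` and `count_with_reduction` are hypothetical, and combining the per-block sums with one `atomicAdd` per block is an assumption made here for brevity. The sketch also assumes `n > 0` and a power-of-two `threads_per_block`.

```cuda
#include <cuda_runtime.h>

// Tree reduction in shared memory, then one global update per block.
// Assumes blockDim.x is a power of two.
__global__ void block_reduce_increment(int n, int *total) {
    extern __shared__ int partial[];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Each in-range thread contributes one increment.
    partial[tid] = (idx < n) ? 1 : 0;
    __syncthreads();

    // Halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // Only one thread per block writes to global memory.
    if (tid == 0) atomicAdd(total, partial[0]);
}

// Hypothetical host helper: returns the final count, or -1 on a CUDA error.
int count_with_reduction(int n, int threads_per_block) {
    int *d_total = nullptr;
    int h_total = 0;
    if (cudaMalloc(&d_total, sizeof(int)) != cudaSuccess) return -1;
    cudaError_t err = cudaMemset(d_total, 0, sizeof(int));
    if (err == cudaSuccess) {
        int blocks = (n + threads_per_block - 1) / threads_per_block;
        size_t shared_bytes = threads_per_block * sizeof(int);
        block_reduce_increment<<<blocks, threads_per_block, shared_bytes>>>(n, d_total);
        err = cudaGetLastError();  // catches launch-configuration errors
    }
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish before reading the result.
        err = cudaMemcpy(&h_total, d_total, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_total);
    return err == cudaSuccess ? h_total : -1;
}
```

Compared with having every thread update the counter directly, this issues one global atomic per block instead of one per thread, which is the usual reason to prefer a reduction when contention on a single address is high.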

Gardener · Accepted Answer · 2018-11-27T22:53:41.493

3

Use an atomic operation. Which atomic functions are available depends on the device's compute capability, so if this compiles with no warnings it is likely to work, but it should still be tested :-), or you should at least examine the generated assembly.

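__global__ void mykernel(int *value){
    // atomicAdd does the read-modify-write as one indivisible step,
    // so concurrent increments are not lost. It returns the previous value.
    int my_old_val = atomicAdd(value, 1);
    (void)my_old_val;  // the previous value is not needed here
}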

See the guide here
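For completeness, here is a hedged sketch of how such a kernel might be launched and checked from the host. The kernel name `increment_kernel` and the helper `count_with_atomic_add` are hypothetical, introduced only for illustration. The counter is a single `int` in device memory, standing in for Y[0].

```cuda
#include <cuda_runtime.h>

// Every thread adds 1 to the same address. atomicAdd makes each
// read-modify-write indivisible, so no increment is lost.
__global__ void increment_kernel(int *value) {
    atomicAdd(value, 1);
}

// Hypothetical host helper (not from the original answer): launches
// blocks * threads_per_block threads and returns the final counter,
// or -1 if any CUDA call fails.
int count_with_atomic_add(int blocks, int threads_per_block) {
    int *d_value = nullptr;
    int h_value = 0;
    if (cudaMalloc(&d_value, sizeof(int)) != cudaSuccess) return -1;
    cudaError_t err = cudaMemcpy(d_value, &h_value, sizeof(int), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        increment_kernel<<<blocks, threads_per_block>>>(d_value);
        err = cudaGetLastError();  // catches launch-configuration errors
    }
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish before reading the result.
        err = cudaMemcpy(&h_value, d_value, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_value);
    return err == cudaSuccess ? h_value : -1;
}
```

Calling `count_with_atomic_add(4, 256)` from host code launches 1024 threads, so the returned counter should be 1024 rather than the 1 seen with a plain `Y[0]++`.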

edited Nov 27 '18 at 22:53

answered Nov 27 '18 at 20:04


1

OP asks for CUDA not C – Ajay Brahmakshatriya Nov 27 '18 at 20:55
