My CUDA application performs an associative reduction over a volume. Essentially, each thread computes values that are atomically added to overlapping locations in the same output buffer in global memory.
Is it possible to launch this kernel concurrently with different input parameters but the same output buffer? In other words, the concurrent kernels would all share one global buffer and write to it atomically.
All kernels are running on the same GPU.
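
For concreteness, here is a minimal sketch of the setup I have in mind; the kernel body, buffer sizes, and parameter values are illustrative only, not my actual code:

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: many threads (and, potentially, several concurrent
// launches) add into overlapping locations of the same output buffer,
// so the accumulation is done with atomicAdd.
__global__ void reduceVolume(const float* in, float* out, int n, float scale)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(&out[i % 256], in[i] * scale);
    }
}

int main()
{
    const int n = 1 << 20;
    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, 256 * sizeof(float));
    cudaMemset(d_in, 0, n * sizeof(float));
    cudaMemset(d_out, 0, 256 * sizeof(float));

    // Two streams so the launches can overlap; both kernels write
    // atomically into the same d_out buffer with different parameters.
    cudaStream_t s0, s1;
    cudaStreamCreate(&s0);
    cudaStreamCreate(&s1);

    dim3 block(256), grid((n + block.x - 1) / block.x);
    reduceVolume<<<grid, block, 0, s0>>>(d_in, d_out, n, 1.0f);
    reduceVolume<<<grid, block, 0, s1>>>(d_in, d_out, n, 2.0f);

    cudaDeviceSynchronize();

    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The question is whether this pattern (concurrent launches on separate streams, all accumulating atomically into one shared global buffer on a single GPU) is safe and well-defined.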