
I have a CUDA kernel whose input is a global array and whose output is another array of the same size. I'm trying to remap the values from one array to the other, conditional on the value of one of their neighbours. The pseudocode would look like this:

remap(int* in, int* out, int* counter) {
    idx = threadIdx
    q = index calculated based on idx
    if ( in[q] > 0 ) {
        newValue = in[q];
    } else {
        newValue = *counter;
        *counter += 1;
        in[q] = newValue;
    }
    out[idx] = newValue;
}

The problem with this code is that multiple threads may check and update the value at index q simultaneously. Real code closer to what I want to achieve would be something like:

__global__
void remap(int* in, int* out, int* counter) {
    idx = threadIdx;
    q = index calculated based on idx;
    newValue = atomicCAS(&in[q], 0, atomicAdd(counter, 1));
    out[idx] = newValue;
}

According to the documentation, the atomicCAS above computes (in[q] == 0) ? newCounter : in[q] and stores the result in in[q], as desired. The problem is that atomicAdd is evaluated as an argument before atomicCAS runs, so counter is incremented no matter what, whether or not the condition is met. I was looking for something along the lines of:

(in[q] == 0) ? in[q] = (counter += 1) : in[q]

I do realise that the above is not syntactically correct; I'm just trying to make clear that I want two update operations to happen atomically together, the increment of counter and the update of in[q], but ONLY if a condition is met. If it is not met, I want neither of them to happen.


Hope the question is fairly clear. The question now is: is this at all doable? That is, an atomic update of two buffers if a condition is met, and a read of the buffer's value otherwise. Or is this not doable in CUDA, and should I be looking elsewhere?

Imanol Luengo
  • A critical section is probably your only real option and that is very tricky to get correct and has limited scalability – talonmies Sep 15 '17 at 12:23
  • You can't do an atomic update of 2 *separated locations in memory*. But with a bit of data reorganization you may be able to use a [custom atomic operation](https://stackoverflow.com/questions/17411493/how-can-i-implement-a-custom-atomic-function-involving-several-variables/17414007#17414007) to update 2 *variables*. – Robert Crovella Sep 15 '17 at 15:09
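
For reference, one possible way to sidestep the two-location atomic is to split the work into two kernel launches. This is a minimal sketch under stated assumptions, not a definitive answer. In the first launch, atomicCAS claims an empty slot with a sentinel, and only the thread that wins the claim calls atomicAdd, so counter advances exactly once per empty slot. In the second launch, every thread reads the now-final in[q]. The names CLAIMED, neighbour, claim, gather and remap_host are hypothetical, and neighbour merely stands in for the unspecified index calculation. The sketch assumes that 0 marks an empty slot, that valid values are positive, and that counter starts above 0. Error checking of the CUDA calls is omitted for brevity.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical sentinel for "claimed but not yet filled".
// Assumes valid values in `in` are never negative.
constexpr int CLAIMED = -1;

// Hypothetical stand-in for "index calculated based on idx".
__device__ int neighbour(int idx) { return idx / 2; }

// Pass 1: atomicCAS returns the OLD value of the slot, so exactly one
// thread per empty slot sees 0, and only that thread bumps the counter.
__global__ void claim(int* in, int* counter, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;
    int q = neighbour(idx);
    if (atomicCAS(&in[q], 0, CLAIMED) == 0) {
        // atomicAdd returns the counter value before the increment.
        atomicExch(&in[q], atomicAdd(counter, 1));
    }
}

// Pass 2: launched after pass 1 on the same stream, so every slot
// already holds its final value and a plain read is enough.
__global__ void gather(const int* in, int* out, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) out[idx] = in[neighbour(idx)];
}

// Host helper: runs both passes and reports how many new values were issued.
std::vector<int> remap_host(std::vector<int> in, int first_label, int* labels_used) {
    assert(!in.empty());
    assert(first_label > 0);  // 0 must stay reserved for "empty"
    const int n = static_cast<int>(in.size());
    const size_t bytes = n * sizeof(int);

    int *d_in, *d_out, *d_counter;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_counter, &first_label, sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    claim<<<blocks, threads>>>(d_in, d_counter, n);
    gather<<<blocks, threads>>>(d_in, d_out, n);

    std::vector<int> out(n);
    int last = 0;
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(&last, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    *labels_used = last - first_label;

    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_counter);
    return out;
}
```

Calling remap_host({5, 0, 0, 7, 0, 0, 0, 0}, 100, &used) fills the two empty slots at q = 1 and q = 2 with 100 and 101 in an unspecified order, and used comes back as 2. The cost of this design is the extra kernel launch; the benefit is that no thread ever has to spin waiting for another thread's value.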
