7

I am doing the following:

__shared__ int exForBlockLessThanP = totalElementLessThanPivotEntireBlock[blockIdx.x];

where totalElementLessThanPivotEntireBlock is an array on the GPU. The compiler throws the error stated in the title of the question. I really don't understand why this is a problem.

Programmer

1 Answer

12

Static initialization of shared variables is illegal in CUDA. The problem is that the semantics of how every thread should treat static initialization of shared memory is undefined in the programming model. Which thread should do the write? What happens if the value is not uniform between threads? How should the compiler emit code for such a case and how should the hardware run it?

In your nonsensical example you are asking every thread in the block to initialize the same shared variable with a value -- basically a statically compiled memory race.
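A minimal sketch of one common workaround, assuming the goal is simply for every thread in a block to read the same per-block value: declare the shared variable without an initializer, let a single thread write it, and synchronize before anyone reads it. The kernel name useBlockTotal, the out array, and the host helper runExample are hypothetical names introduced only for this illustration; they are not from the question.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each block copies its own entry of the global
// array into shared memory, then every thread reads that value.
__global__ void useBlockTotal(const int *totalElementLessThanPivotEntireBlock,
                              int *out)
{
    // Declaration only, with no initializer, so this line is legal.
    __shared__ int exForBlockLessThanP;

    // Exactly one thread per block performs the write, so there is no race.
    if (threadIdx.x == 0) {
        exForBlockLessThanP = totalElementLessThanPivotEntireBlock[blockIdx.x];
    }

    // Barrier: no thread continues until thread 0's write is visible.
    __syncthreads();

    // Hypothetical use of the value: every thread stores what it read.
    out[blockIdx.x * blockDim.x + threadIdx.x] = exForBlockLessThanP;
}

// Hypothetical host-side driver: returns true if every thread saw the
// value that belongs to its block.
bool runExample()
{
    constexpr int numBlocks = 4;
    constexpr int threadsPerBlock = 32;
    constexpr int n = numBlocks * threadsPerBlock;

    int hTotals[numBlocks] = {10, 20, 30, 40};
    int hOut[n] = {0};

    int *dTotals = nullptr;
    int *dOut = nullptr;
    if (cudaMalloc(&dTotals, sizeof(hTotals)) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, sizeof(hOut)) != cudaSuccess) {
        cudaFree(dTotals);
        return false;
    }

    cudaMemcpy(dTotals, hTotals, sizeof(hTotals), cudaMemcpyHostToDevice);
    useBlockTotal<<<numBlocks, threadsPerBlock>>>(dTotals, dOut);

    // This blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);

    cudaFree(dTotals);
    cudaFree(dOut);

    bool ok = (err == cudaSuccess);
    for (int i = 0; i < n; ++i) {
        ok = ok && (hOut[i] == hTotals[i / threadsPerBlock]);
    }
    return ok;
}
```

The __syncthreads() call matters here: without it, threads other than thread 0 could read the shared variable before it has been written. Calling runExample() from main on a machine with a CUDA device should be enough to try the sketch.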

talonmies
  • Please note: in the above, I am asking each thread in the block to assign exForBlockLessThanP the same value – Programmer Dec 24 '11 at 10:40
  • 1
    Use a conditional statement to have one thread do the initialization at the beginning of the kernel – talonmies Dec 24 '11 at 10:46
  • Does __shared__ int p; face the same problem, since every thread in the block does the same thing? – Programmer Dec 24 '11 at 10:52
  • No, of course not. That is only a declaration - it does not generate any code. – talonmies Dec 24 '11 at 11:22
  • 5
    @talonmies Your comment on single-thread conditional initialization is the most important part of the answer, and buried in the comments. The tone of the answer is also quite aggressive and demeaning. – kmac Oct 18 '16 at 18:34