
I have written a trivial kernel in which I declare my shared memory array as

extern __shared__ float As[100];

In my kernel launch I specify the number of bytes of shared memory. Compiling the kernel to PTX fails with the error "Unexpected address space". I am using a fairly recent version of LLVM from svn (3.3 in progress). Any ideas what I am doing wrong here? The problem seems to be with the `extern` keyword, but then how else would I declare shared memory? Should I use a different LLVM build?

Configuration: CUDA 5.0, NVIDIA Tesla C1060

lucent
  • try this: `extern __shared__ float As[];` – sgarizvi Mar 07 '13 at 06:56
  • So you are using your own LLVM build, not anything from the CUDA toolkit? – talonmies Mar 07 '13 at 09:54
  • maybe you could provide a more complete source code. – RoBiK Mar 07 '13 at 09:57
  • Does your code build correctly with the default CUDA 5 toolkit? – Robert Crovella Mar 07 '13 at 15:57
  • @talonmies Not my own build, but one from the LLVM svn, taken about a month ago. @Robert Crovella: nvcc throws "error: __local__ and __shared__ variables cannot have external linkage" – lucent Mar 08 '13 at 16:08
  • Although you state that nvcc throws this error, I suspect it is for the case where you specify both `extern` and `As[100]`. When using the extern/dynamic method, you must use the format given by sgar91. Here are the [docs on that subject](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#shared). If you had compiled the line as sgar91 indicated, I don't think you would get that error. And if you compile the line that way with your specific LLVM and it throws some other error, then I think that LLVM build is suspect. – Robert Crovella Mar 08 '13 at 18:48

2 Answers

1

Well, it turns out that the `extern` keyword is not really required in this case, according to Gert-Jan from the Nvidia forum. I am not sure what his ID is on SO.

His reply --

"If you know how many elements your shared memory array has (e.g. 100 elements), you should not use the extern keyword, and you don't have to specify the number of bytes of shared memory in the kernel launch (the compiler can figure it out by himself). Only if you don't know how many elements you will need, you have to specify this in the kernel launch, and in your kernel you have to write "extern __shared__ float *As"."
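For reference, here is a minimal sketch of the two declaration styles discussed above, side by side. The kernel names `reverse_static` and `reverse_dynamic`, the macro `N`, and the reverse-the-array workload are made up for illustration and are not from the question. The dynamic kernel uses the unsized array form `extern __shared__ float As[];` suggested in the comments. This is written for nvcc; whether the asker's LLVM svn build accepts it is not something this sketch can confirm.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 100  // element count from the question (name is hypothetical)

// Static form: the size is known at compile time, so there is no `extern`
// and no shared-memory byte count in the launch configuration.
__global__ void reverse_static(float *data)
{
    __shared__ float As[N];
    int i = threadIdx.x;
    As[i] = data[i];
    __syncthreads();            // all loads must finish before any read
    data[i] = As[N - 1 - i];
}

// Dynamic form: unsized `extern` array; the size in bytes is supplied as
// the third launch-configuration argument.
__global__ void reverse_dynamic(float *data, int n)
{
    extern __shared__ float As[];
    int i = threadIdx.x;
    As[i] = data[i];
    __syncthreads();
    data[i] = As[n - 1 - i];
}

int main()
{
    float host[N];
    for (int i = 0; i < N; ++i) host[i] = (float)i;

    float *dev = NULL;
    if (cudaMalloc((void **)&dev, N * sizeof(float)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dev, host, N * sizeof(float), cudaMemcpyHostToDevice);

    reverse_static<<<1, N>>>(dev);                         // no byte count
    reverse_dynamic<<<1, N, N * sizeof(float)>>>(dev, N);  // byte count given

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel error: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }
    cudaMemcpy(host, dev, N * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // The array was reversed twice, so it should be back in its original order.
    int mismatches = 0;
    for (int i = 0; i < N; ++i)
        if (host[i] != (float)i) ++mismatches;
    printf("mismatches: %d\n", mismatches);
    return mismatches != 0;
}
```

Mixing the two forms, that is `extern` together with an explicit size such as `As[100]`, is the combination that nvcc rejected in the comments on the question.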

Hope this helps other users.

lucent
  • It's true that `extern` is not required if you know how much space is needed beforehand. But it should also be valid syntax to say `extern __shared__ float As[]` and specify the size at kernel invocation. So to me this answer explains nothing. sgar91 suggested you fix your extern syntax and you said you tried it and got the same error. It's understood that there are 2 types of valid syntax for specifying shared memory. Both should work. You tried one, it did not work, and you switched to the other and it did work. That does not explain why the first syntax is not working. – Robert Crovella Mar 08 '13 at 18:38
0

I am not sure whether CUDA C/C++ supports this, but you could try setting the address space attribute as a workaround:

__attribute__((address_space(3)))
extern __shared__ float As[100];

That should force LLVM to put it in the shared address space. Good luck!

fursund