
A kernel with a shared array and a couple of local ints:

__global__ void myKern()
{
  int globalID = ....; // initialize global thread ID

  __shared__ int TMS[3]; // populate the shared array in a simple way

  if (globalID == 0)
  {
    TMS[0] = 0;
    TMS[1] = 1;
    TMS[2] = 2;
  }
  __syncthreads();

  int val0 = 69;
  int val1 = 36;
  int val2 = 92;

  int random_number = ....; // use cuRAND to get a random number in [0, 3)

  int output = TMS[random_number];
  // At this point, I want the variable "output" to be used to access one of my local ints.
  // For example, if "output" == 2, I want to be able to print val2 to the screen.
  // In a fantasy computer language this might look something like:
  // std::cout << "val" + "output";
  // I just want 92 to be printed to the screen.

  ???
}

This may seem like an odd algorithm, but if I can do this, it will let me combine the speed of registers with the larger capacity of shared memory in my CUDA project. Please, no brute-force binary solutions, since I will be using a shared array of size 2698 with 33 local variables.

Jordan
  • Could you please clarify what you really need? You say: _if "output" = 2, I want to be able to print val2 to screen_. Then you say: _std::cout<< "val" + "output"_. It seems that you want to treat a bunch of register variables as if they were a single array and exploit array pointer arithmetic? – Vitality Jul 03 '14 at 05:33
  • Sorry if it's unclear, it's difficult to explain. Maybe this will clarify: – Jordan Jul 03 '14 at 05:41
  • If output =0, I want 69 printed. – Jordan Jul 03 '14 at 05:41
  • If output = 1, I want 36 printed. If output = 2, I want 92 printed. – Jordan Jul 03 '14 at 05:42
  • This setup essentially mimics an array of pointers where the shared array holds all the pointers and the local variables are the values at those pointers' locations. – Jordan Jul 03 '14 at 05:50
  • Then I would say that there is not much related to CUDA in this post and your question is a standard C/C++ one. I do not see any possibilities other than using an `if-else` or a `switch` or, more compactly, `(output==0)*(val0)+(output==1)*val1+(output==2)*val2`, although this latter solution can be less efficient. I don't think it will be possible to exploit C/C++ pointer arithmetic in the way you are seeking to. – Vitality Jul 03 '14 at 05:51
  • This strikes me as a misguided/premature attempt at optimization. "Combining the speed of registers with the large size of shared" makes no sense if your algorithm requires a read of shared memory *and* a register read. Why not just set TMS[0] = 69, TMS[1] = 36, etc.? Now the lookup only requires a read of shared memory. There is no way at the source code level to create an indexed lookup into a set of discrete registers/register variables. – Robert Crovella Jul 03 '14 at 05:52
  • It's really complicated, Robert. The local variables change many thousands of times in my code, and putting them in shared memory would mean waiting for threads to sync and paying penalties for bank conflicts. – Jordan Jul 03 '14 at 05:57
  • bank conflicts: `int output = TMS[random_number];` – Robert Crovella Jul 03 '14 at 06:36
  • Despite having read this question and all the comments through several times, I still have no idea what the question here is, what it is you are trying to do and why you are trying to do it. If you can't articulate what it is you are asking, I suggest going away and having a think about your question, and then come back when you have something more concrete to ask. Vote to close.... – talonmies Jul 03 '14 at 09:43
  • [This](http://stackoverflow.com/questions/24546385/accessing-a-variable-in-c-by-stitching-its-name-together) looks like one of your questions. – Robert Crovella Jul 03 '14 at 18:53
  • @RobertCrovella Interesting, but how is it possible to reproduce the `map` functionality in a `__global__` function? Will Thrust help on this point? And what about performance? – Vitality Jul 04 '14 at 05:29
  • @JackOLantern: I am not aware of any dictionary-like functionality for CUDA device code. I think that is an example of "ask a C++ question, get a C++ answer". – talonmies Jul 04 '14 at 08:41
  • I was pointing out the question, primarily, not any of the answers to it. I consider the question ("can I reference a variable with 'varname' + 'number' ?") to be a somewhat unusual one, especially so since it was nearly coincident with this one, which seemed to be asking almost the same thing. And I consider the principal gist of the answer to be "no", which I think is fairly evident to any experienced C programmer. I'm not suggesting that the description of `std::map` is applicable to CUDA. Independently, I've seen some indications that OP may be using multiple accounts on SO. – Robert Crovella Jul 04 '14 at 10:28

1 Answer

You can use the following:

int vals[] = { 69, 36, 92 };
int random_number = ....;
int output = TMS[random_number];

int chosen = vals[output];

This assumes the random number is between 0 and 2.
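
Put together as a complete kernel, the idea might look like the sketch below. The names `selectKernel` and `runSelect` are hypothetical, and the index is passed in as a parameter only to stand in for the elided cuRAND call. The shared array is filled by thread 0 of each block here, on the assumption that every block needs its own copy, since shared memory is per block.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: random_number is a parameter standing in for cuRAND.
__global__ void selectKernel(int random_number, int *result)
{
    __shared__ int TMS[3];

    // Shared memory is per block, so one thread of each block fills it.
    if (threadIdx.x == 0)
    {
        TMS[0] = 0;
        TMS[1] = 1;
        TMS[2] = 2;
    }
    __syncthreads();

    // A local array replaces the separate val0, val1, val2 variables.
    int vals[] = { 69, 36, 92 };

    int output = TMS[random_number];   // assumes 0 <= random_number <= 2
    int chosen = vals[output];

    // Report the value from a single thread only.
    if (blockIdx.x == 0 && threadIdx.x == 0)
    {
        printf("%d\n", chosen);
        *result = chosen;
    }
}

// Hypothetical host helper: launches one block and returns the chosen value.
// Error checking is omitted to keep the sketch short.
int runSelect(int random_number)
{
    int *d_result = nullptr;
    int h_result = -1;
    cudaMalloc(&d_result, sizeof(int));
    selectKernel<<<1, 32>>>(random_number, d_result);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_result);
    return h_result;
}
```

Calling `runSelect(2)` from host code exercises the case described in the question, where `output` is 2 and `val2` should be selected.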

MrDor
  • The problem with an array solution like this is that the values of the array cannot be stored in registers :( – Jordan Jul 05 '14 at 20:05
  • My mistake, I just looked it up and apparently you CAN store small arrays in registers. Thank you! – Jordan Jul 05 '14 at 20:28
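
A caveat worth checking: the compiler can generally keep a small array in registers only when every index into it is known at compile time. An index computed at run time, such as `output` above, usually causes the array to be placed in local memory instead. One possible workaround is a fully unrolled compare-and-select loop, sketched below. The names `NUM_VALS`, `selectFromRegisters`, `unrolledSelectKernel`, and `runUnrolledSelect` are hypothetical, and whether the values really stay in registers depends on the compiler and architecture, so it should be confirmed with `nvcc -Xptxas -v` or by inspecting the generated code.

```cuda
#include <cuda_runtime.h>

#define NUM_VALS 3   // hypothetical; the question mentions 33 local variables

// Selects vals[index] using only compile-time-constant array indices.
__device__ int selectFromRegisters(const int (&vals)[NUM_VALS], int index)
{
    int chosen = 0;   // returned unchanged if index is out of range
#pragma unroll
    for (int i = 0; i < NUM_VALS; ++i)
    {
        // After unrolling, i is a constant in each copy of the loop body.
        if (i == index)
            chosen = vals[i];
    }
    return chosen;
}

// Hypothetical kernel: index stands in for the value read from shared memory.
__global__ void unrolledSelectKernel(int index, int *result)
{
    int vals[NUM_VALS] = { 69, 36, 92 };
    *result = selectFromRegisters(vals, index);
}

// Hypothetical host helper; error checking is omitted for brevity.
int runUnrolledSelect(int index)
{
    int *d_result = nullptr;
    int h_result = -1;
    cudaMalloc(&d_result, sizeof(int));
    unrolledSelectKernel<<<1, 1>>>(index, d_result);
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_result);
    return h_result;
}
```

This costs one comparison per value, so it is essentially the linear scan the question hoped to avoid. It trades extra instructions for avoiding local-memory traffic, which may or may not be a win with 33 values.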