
My CUDA application has constant memory of less than 8KB. Since it will all be cached, do I need to worry about every thread accessing the same address for optimization?

If yes, how do I assure all threads are accessing the same address at the same time?

einpoklum

1 Answer


Since it will all be cached, do I need to worry about every thread accessing the same address for optimization?

Yes. The cache itself can only serve up one 32-bit word per cycle, so if the threads in a warp read different addresses, those reads are serialized.

If yes, how do I assure all threads are accessing the same address at the same time?

Make sure that the index or address you use to reference an element in the constant memory area does not depend on any of the built-in thread variables, e.g. threadIdx.x, threadIdx.y, or threadIdx.z. Note that the actual requirement is less stringent than this: it is enough that the index evaluates to the same number for every thread in a given warp. Here are a few examples:

__constant__ int data[1024];
...
// assume 1D threadblock
int idx = threadIdx.x;
int bidx = blockIdx.x;
int a = data[idx];      // bad - every thread accesses a different element
int b = data[12];       // ok  - every thread accesses the same element
int c = data[b];        // ok  - b is a constant w.r.t threads
int d = data[b + idx];  // bad - idx differs per thread, so the sum does too
int e = data[b + bidx]; // ok
int f = data[idx/32];   // ok - the same element is being accessed per warp
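
To try the two access patterns end to end, here is a minimal sketch of a complete program. The kernel names uniformRead and divergentRead, the helper countMismatches, and the launch configuration are my own choices for illustration, and it assumes a 1D threadblock and a warp size of 32. Both kernels produce correct results; the difference is only in how the constant cache services the reads, which you would have to confirm with a profiler on your own hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 1024
__constant__ int data[N];            // 4 KB of constant memory

// Warp-uniform read: all 32 threads of a warp use the same index.
__global__ void uniformRead(int *out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    out[tid] = data[threadIdx.x / 32];
}

// Divergent read: every thread of a warp uses a different index.
__global__ void divergentRead(int *out)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    out[tid] = data[threadIdx.x];
}

// Runs both kernels and returns the number of wrong output values,
// or -1 if a CUDA call fails.
int countMismatches()
{
    const int threads = 256, blocks = 4, total = threads * blocks;
    int host[N], result[total];
    for (int i = 0; i < N; ++i) host[i] = i;   // data[k] == k

    if (cudaMemcpyToSymbol(data, host, sizeof(host)) != cudaSuccess) return -1;

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, total * sizeof(int)) != cudaSuccess) return -1;

    int bad = 0;

    uniformRead<<<blocks, threads>>>(d_out);
    if (cudaMemcpy(result, d_out, sizeof(result), cudaMemcpyDeviceToHost) != cudaSuccess) bad = -1;
    for (int t = 0; bad >= 0 && t < total; ++t)
        if (result[t] != (t % threads) / 32) ++bad;

    divergentRead<<<blocks, threads>>>(d_out);
    if (bad >= 0 && cudaMemcpy(result, d_out, sizeof(result), cudaMemcpyDeviceToHost) != cudaSuccess) bad = -1;
    for (int t = 0; bad >= 0 && t < total; ++t)
        if (result[t] != t % threads) ++bad;

    cudaFree(d_out);
    return bad;
}

int main()
{
    printf("mismatches: %d\n", countMismatches());
    return 0;
}
```

Compile it with nvcc and time the two kernels separately if you want to see the effect of the access pattern.
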
Robert Crovella
  • Will an unrolled for loop also work? I assume that would be like data[b]; b as a constant. – user2936659 Nov 21 '14 at 21:40
  • As long as the variable used to index the data does not depend on thread indices, there is no issue in a for loop, whether unrolled or not. It's ok, for example, if the variable used to index the data depends on the for-loop iteration variable. You should be able to figure this out, or any other example, by simply asking yourself whether the indexing variable depends on thread indices, or not. – Robert Crovella Nov 21 '14 at 21:42
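
The loop case from the comments could look something like the sketch below. The names weights, weightedSum and countLoopMismatches, and the arithmetic itself, are made up for illustration. The point is only that the constant-memory index k comes from the loop counter, so in each iteration every thread in the warp reads the same element, whether or not the compiler unrolls the loop.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NUM_WEIGHTS 8
__constant__ int weights[NUM_WEIGHTS];

// Each thread combines its own input value with every weight.
// The index into weights is the loop counter k, which is the same
// for all threads, so each read is warp-uniform.
__global__ void weightedSum(const int *in, int *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    int x = in[tid];       // per-thread global-memory read
    int acc = 0;
    #pragma unroll
    for (int k = 0; k < NUM_WEIGHTS; ++k)
        acc += weights[k] * (x + k);
    out[tid] = acc;
}

// Runs the kernel and returns the number of wrong output values,
// or -1 if a CUDA call fails.
int countLoopMismatches()
{
    const int n = 1000, threads = 128;
    const int blocks = (n + threads - 1) / threads;

    int hostWeights[NUM_WEIGHTS], hostIn[n], hostOut[n];
    for (int k = 0; k < NUM_WEIGHTS; ++k) hostWeights[k] = k + 1;
    for (int i = 0; i < n; ++i) hostIn[i] = i;

    if (cudaMemcpyToSymbol(weights, hostWeights, sizeof(hostWeights)) != cudaSuccess) return -1;

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) { cudaFree(d_in); return -1; }

    int bad = 0;
    if (cudaMemcpy(d_in, hostIn, n * sizeof(int), cudaMemcpyHostToDevice) != cudaSuccess) bad = -1;

    if (bad == 0) {
        weightedSum<<<blocks, threads>>>(d_in, d_out, n);
        if (cudaMemcpy(hostOut, d_out, n * sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) bad = -1;
    }

    // Recompute the same sum on the host and compare.
    for (int i = 0; bad >= 0 && i < n; ++i) {
        int expected = 0;
        for (int k = 0; k < NUM_WEIGHTS; ++k) expected += hostWeights[k] * (hostIn[i] + k);
        if (hostOut[i] != expected) ++bad;
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return bad;
}

int main()
{
    printf("mismatches: %d\n", countLoopMismatches());
    return 0;
}
```

If the loop bound or the starting index were derived from threadIdx.x instead, the reads would no longer be uniform across the warp.
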