
As the title says, I would like to know the correct thread ordering for a 3D block.

I recall reading something about this a while ago, but I don't remember where, and the source didn't seem very reliable.

So I would like some confirmation.

Is the order as follows (split into warps)?

[0, 0, 0]...[blockDim.x - 1, 0, 0] - [0, 1, 0]...[blockDim.x - 1, 1, 0] - (...) - [0, blockDim.y - 1, 0]...[blockDim.x - 1, blockDim.y - 1, 0] - [0, 0, 1]...[blockDim.x - 1, 0, 1] - (...) - [0, blockDim.y - 1, 1]...[blockDim.x - 1, blockDim.y - 1, 1] - (...) - [blockDim.x - 1, blockDim.y - 1, blockDim.z - 1]

talonmies
elect

1 Answer


Yes, that is the correct ordering. Within a block, threads are numbered with the x dimension varying fastest, then y, then z (equivalent to column-major order). The calculation can be expressed as

int threadID = threadIdx.x + 
               blockDim.x * threadIdx.y + 
               (blockDim.x * blockDim.y) * threadIdx.z;

int warpID = threadID / warpSize;
int laneID = threadID % warpSize;

Here threadID is the thread number within the block, warpID is the index of the warp within the block, and laneID is the thread number within the warp.
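To check the arithmetic without a GPU, here is a minimal host-side sketch of the same linearization in plain C++. The names `Dim3`, `kWarpSize`, `linearThreadId`, `warpId` and `laneId` are hypothetical and not part of the CUDA API; `kWarpSize = 32` is an assumption that stands in for the built-in `warpSize`, and real device code should read `warpSize` instead.

```cpp
// Host-side mirror of the index arithmetic above (illustrative only).
// Dim3 stands in for CUDA's dim3 / threadIdx / blockDim.
struct Dim3 {
    unsigned x, y, z;
};

// Assumption: 32 threads per warp, the value of warpSize on current NVIDIA GPUs.
constexpr unsigned kWarpSize = 32;

// Linear thread number within the block: x varies fastest, then y, then z.
constexpr unsigned linearThreadId(Dim3 idx, Dim3 dim) {
    return idx.x + dim.x * idx.y + (dim.x * dim.y) * idx.z;
}

// Index of the warp within the block that contains this thread.
constexpr unsigned warpId(Dim3 idx, Dim3 dim) {
    return linearThreadId(idx, dim) / kWarpSize;
}

// Position of this thread within its warp.
constexpr unsigned laneId(Dim3 idx, Dim3 dim) {
    return linearThreadId(idx, dim) % kWarpSize;
}

// Worked example for an 8 x 4 x 2 block (64 threads, i.e. two warps):
// thread (3, 2, 1) has linear ID 3 + 8*2 + 32*1 = 51, so warp 1, lane 19.
static_assert(linearThreadId(Dim3{3, 2, 1}, Dim3{8, 4, 2}) == 51, "linear ID");
static_assert(warpId(Dim3{3, 2, 1}, Dim3{8, 4, 2}) == 1, "warp index");
static_assert(laneId(Dim3{3, 2, 1}, Dim3{8, 4, 2}) == 19, "lane index");
```

With an 8 x 4 x 2 block, each z-slice holds exactly 32 threads, so warp 0 covers all of z = 0 and warp 1 covers all of z = 1. When blockDim.x * blockDim.y is not a multiple of the warp size, a single warp spans more than one z-slice.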

Note that threads are not necessarily executed in any predictable order related to this numbering within a block. The execution model guarantees that threads in the same warp are executed in "lock-step", but you can't infer anything more than that from the thread numbering within a block.

haccks
talonmies