2

A warp is 32 threads. Do the 32 threads execute in parallel in a multiprocessor? If the 32 threads are not executing in parallel, then there is no race condition within the warp. I got this doubt after going through some examples.

einpoklum
kar
  • Seems to be a duplicate of: http://stackoverflow.com/questions/5268103/cuda-threads-in-a-wrap Why did you ask the same question twice? – Himadri Choudhury Mar 11 '11 at 04:48
  • Please don't ask duplicate questions, just edit this one. As you had answers on the other, I've merged them. – Tim Post Mar 11 '11 at 08:27

2 Answers

4

In the CUDA programming model, all the threads within a warp run in parallel. The actual execution in hardware may not be parallel, however, because the number of cores within an SM (Streaming Multiprocessor) can be fewer than 32. For example, the GT200 architecture has 8 cores per SM, so the threads within a warp need 4 clock cycles to finish executing an instruction.

If multiple threads write to the same location (in either shared memory or global memory) and you want to avoid a race, you have to use atomic operations or locks, because the CUDA programming model does not guarantee which thread's write will take effect.
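To make the point about atomics concrete, here is a minimal sketch in CUDA. The kernel names and the host helper `run_counter` are made up for illustration, and error handling is kept to a bare minimum. Every thread increments one counter in global memory: the plain version is a race whose result is undefined, while the `atomicAdd` version should always count every thread.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Plain read-modify-write on a shared location: threads, even those in the
// same warp, can overwrite each other's update, so the final value is undefined.
__global__ void racy_increment(int *counter) {
    *counter = *counter + 1;  // data race
}

// The same update performed as one indivisible operation.
__global__ void atomic_increment(int *counter) {
    atomicAdd(counter, 1);
}

// Hypothetical host helper (not part of any library): launches
// blocks * threads threads that all increment one counter and returns the
// final value, or -1 if a CUDA call fails.
int run_counter(bool use_atomic, int blocks, int threads) {
    assert(blocks > 0 && threads > 0);

    int *d_counter = nullptr;
    int h_counter = -1;
    if (cudaMalloc(&d_counter, sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_counter, 0, sizeof(int));

    if (use_atomic) {
        atomic_increment<<<blocks, threads>>>(d_counter);
    } else {
        racy_increment<<<blocks, threads>>>(d_counter);
    }

    // cudaMemcpy waits for the kernel on the default stream to finish.
    cudaError_t err = cudaMemcpy(&h_counter, d_counter, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return err == cudaSuccess ? h_counter : -1;
}
```

Calling `run_counter(true, 1, 32)` launches exactly one warp and should return 32. With `use_atomic` set to `false` the result is not guaranteed and will typically be smaller than the thread count, even for a single warp.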

Neo Hpcer
-2

Yes. The 32 threads in a warp will execute in parallel. The GPU is a SIMT (single instruction, multiple threads) machine: a single instruction is executed by multiple threads in parallel.

Btw, SIMT is somewhat of a marketing term; it is basically the same as SIMD.

Himadri Choudhury
  • I read that each multiprocessor has 8 scalar processors. How is it possible for 32 threads to execute in parallel? – kar Mar 11 '11 at 04:39
  • 8 scalar processors of the same kind – kar Mar 11 '11 at 04:44
  • 2
    1 scalar processor handles 4 threads concurrently, in a way that is invisible to the programmer. Note that Fermi cards have more SPs per multiprocessor, and those can actually execute 2 warps at a time (independently) – CygnusX1 Mar 11 '11 at 07:35
  • This answer is technically incorrect. Physically it's often the case that the threads in a warp do _not_ execute in parallel. – einpoklum Nov 09 '16 at 09:22