Does it make sense to use unsigned short integers in registers (to save register space) and in shared memory (for faster access) in CUDA programs?

I wrote a template `__device__` function (which uses registers and shared memory) and specialized it for `uint` and `ushort`. The results:

  • For `uint`: 25 registers, throughput 460 MB/s.
  • For `ushort`: 26 registers, throughput 420 MB/s.

So, judging by this test, there is no reason to use `unsigned short int`.

paleonix
Alex
  • There might be a need for `short int` if you have severe memory limitations. Other than that, there is no use for it, since, as talonmies pointed out, all registers are 32-bit. Also, using `short int`s can easily lead to bank conflicts in shared memory. – aland Oct 04 '12 at 19:25
  • I.e., does it make sense to pack two 16-bit variables into a single 32-bit register to reduce the number of registers used? – Alex Oct 04 '12 at 20:02
  • You can do that, but it requires extra arithmetic to unpack it. I doubt the compiler would even store a `short2` in a single register. – harrism Oct 04 '12 at 23:45
  • But why was it removed in SM 2.x, and why doesn't the compiler support it? And does anyone actually pack two values into one register? – Alex Oct 05 '12 at 16:05
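
The manual packing discussed in the comments above can be sketched roughly as follows. This is only an illustration of the "extra arithmetic" involved, not code from the question: the helper names (`pack2x16`, `unpack_lo`, `unpack_hi`, `increment_lo`) and the `HOST_DEVICE` macro are hypothetical, and the macro is there only so the same helpers compile both as plain C++ and under nvcc. Whether this actually lowers the register count is up to the compiler and has to be measured.

```cpp
#include <cstdint>

// Hypothetical convenience macro: lets the helpers below compile as
// ordinary C++ and as CUDA host/device functions under nvcc.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Pack two 16-bit values into one 32-bit word:
// `hi` goes into the upper 16 bits, `lo` into the lower 16 bits.
HOST_DEVICE constexpr uint32_t pack2x16(uint16_t lo, uint16_t hi) {
    return (static_cast<uint32_t>(hi) << 16) | static_cast<uint32_t>(lo);
}

// Extract the lower 16-bit value (one mask operation).
HOST_DEVICE constexpr uint16_t unpack_lo(uint32_t packed) {
    return static_cast<uint16_t>(packed & 0xFFFFu);
}

// Extract the upper 16-bit value (one shift operation).
HOST_DEVICE constexpr uint16_t unpack_hi(uint32_t packed) {
    return static_cast<uint16_t>(packed >> 16);
}

// Increment only the lower value, wrapping modulo 2^16 so that a carry
// never leaks into the upper value.
HOST_DEVICE constexpr uint32_t increment_lo(uint32_t packed) {
    return (packed & 0xFFFF0000u) | ((packed + 1u) & 0x0000FFFFu);
}
```

Every read or update of a packed value costs at least one extra mask or shift, so any register saved this way is paid for with additional instructions.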

1 Answer

I don't have much experience with CUDA, but I've read that we should avoid using unsigned types (CUDA C Best Practices Guide).

Using shared memory can be the best way to increase performance in our apps. You should think about how to optimize your kernel. Use shared memory when you often read the same value from global memory, or when the threads can first load the data into shared memory, wait until the load has finished, and from then on read it from shared memory instead of global memory.

It all depends on what you want to do. If you want to optimize your kernel, please post some code.

Tomasz Dzięcielewski
  • Yes, that's right for loop counters ("9.3 LOOP COUNTERS SIGNED VS. UNSIGNED") when `#pragma unroll` is not used. I want to reduce the number of registers used, and I want to know: do two unsigned short integer variables take fewer registers than one signed integer variable? – Alex Oct 04 '12 at 19:52