I want to use the SIMD video instructions (vadd4, vmax4, etc.) described in Section 8.7.13 of http://docs.nvidia.com/cuda/pdf/ptx_isa_3.1.pdf
I tried the following in my code:
asm("vadd4.u32.u32.u32 %0, %1, %2, %3;" : "=r"(i) : "r"(j) : "r"(k) : "r"(l));
where i, j, k and l are int variables. I used "r" because it is the constraint for a .u32 register.
But when compiling, I get the following error:
error: unknown register name "r"
What should I use instead of "r" here? Or is there something else wrong in the code? (I am using a Tesla card with compute capability 3.5)
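For comparison, here is a minimal sketch of the operand layout that GCC-style inline asm (which nvcc follows) expects. It assumes the three inputs are meant to go in a single input section separated by commas. Colons separate the sections (outputs : inputs : clobbers), so in the statement above the third colon-delimited section onward would be parsed as a clobber list, which would explain why "r" is reported as a register name. The kernel name, its parameters and the test values are hypothetical, and I am assuming unsigned int operands and a build with -arch=sm_35.

```cuda
#include <cstdio>

// Hypothetical kernel: per-byte add of a and b. With the default mask all four
// bytes of d are written; c is only used for lanes a mask would leave out.
__global__ void vadd4_demo(unsigned int *out, unsigned int a,
                           unsigned int b, unsigned int c)
{
    unsigned int d;
    // Sections are separated by ':' (outputs : inputs : clobbers);
    // operands inside one section are separated by ','.
    asm("vadd4.u32.u32.u32 %0, %1, %2, %3;"
        : "=r"(d)
        : "r"(a), "r"(b), "r"(c));
    *out = d;
}

int main()
{
    unsigned int *d_out = nullptr;
    unsigned int h_out = 0;

    cudaMalloc(&d_out, sizeof(unsigned int));
    // Arbitrary test values, chosen so that no byte lane overflows.
    vadd4_demo<<<1, 1>>>(d_out, 0x01020304u, 0x10203040u, 0u);
    cudaMemcpy(&h_out, d_out, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    printf("0x%08x\n", h_out);
    return 0;
}
```

The only change relative to the statement above is the punctuation between the input operands; the "r" constraint itself is kept.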