I am trying to understand the nvcc compilation phases, but I am a little confused. Because I don't know the exact hardware configuration of the machine that will run my software, I want to use the JIT compilation feature to generate the best possible code for it. In the nvcc documentation I found this:
"For instance, the command below allows generation of exactly matching GPU binary code, when the application is launched on an sm_10, an sm_13, and even a later architecture:"
nvcc x.cu -arch=compute_10 -code=compute_10
So my understanding is that the above options will produce the best (fastest) code for whichever GPU the application ends up running on. Is that correct? I also read that the default nvcc options are:
nvcc x.cu -arch=compute_10 -code=sm_10,compute_10
If the above is indeed correct, why can't I use any compute_20 features in my application?
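To make the question concrete, here is a minimal sketch of a test program that prints the compute capability the GPU reports next to the value of `__CUDA_ARCH__` that the device code was compiled with. The kernel name `report_arch` is hypothetical, and the sketch assumes the GPU of interest is device 0. Built with the command quoted from the documentation (`nvcc x.cu -arch=compute_10 -code=compute_10`), it should show whether the JIT-compiled kernel still sees the virtual architecture given to `-arch` rather than that of the actual GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stores the virtual architecture the device code
// was compiled for (e.g. 100 for compute_10, 200 for compute_20).
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;   // defined only during device compilation
#else
    *out = 0;               // host compilation pass; never runs on the GPU
#endif
}

int main()
{
    // Assumption: the GPU of interest is device 0.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA device found\n");
        return 1;
    }
    printf("GPU compute capability: %d.%d\n", prop.major, prop.minor);

    int *d_out = NULL;
    int h_out = -1;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    report_arch<<<1, 1>>>(d_out);

    // cudaMemcpy waits for the kernel to finish; it also returns any
    // error from the launch itself (e.g. no compatible code in the binary).
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("__CUDA_ARCH__ seen by device code: %d\n", h_out);
    return 0;
}
```

Comparing the two printed values for different `-arch` and `-code` combinations is one way to see which feature set the device code was actually compiled against.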