
If I compile a CUDA program for a lower compute capability, e.g. 1.3 (nvcc flag -arch=sm_13), and run it on a device with compute capability 2.1, will it exploit the features of compute capability 2.1 or not?

In that situation, will the compute capability 2.1 device behave like a compute capability 1.3 device?

sgarizvi

1 Answer


No, it won't exploit any features you need to explicitly program for. Only those features that are transparent to the user (like cache or larger register files) will be used.

Additionally, you need to make sure your object file contains a version of the code compiled to the PTX intermediate language, which can be dynamically (just-in-time) compiled for the target architecture, or your program will not even run.

Compile to a virtual architecture (nvcc -arch compute_13) to ensure that, or create a fat binary with code for multiple architectures using the -gencode option to nvcc.

With a fat binary, you can program for features available only on higher compute capability if you wrap the code inside #if __CUDA_ARCH__ >= xyz preprocessor conditionals.
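As a minimal sketch of that last point (not taken from any particular project), the kernel below selects a code path at compile time with `__CUDA_ARCH__`. The kernel name `reportPath`, the file name `arch_demo.cu`, and the build line in the comment are assumptions made for illustration; the `sm_13` and `sm_20` targets are only accepted by older toolkits, so substitute architectures your nvcc supports.

```cuda
// Possible build line (an assumption; adjust the targets to your toolkit):
//   nvcc -gencode arch=compute_13,code=sm_13 \
//        -gencode arch=compute_20,code=sm_20 \
//        -gencode arch=compute_20,code=compute_20 \
//        -o arch_demo arch_demo.cu
// The last -gencode entry embeds PTX so that newer GPUs can JIT-compile it.
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: records which code path was compiled into the
// binary image that the driver selected for the running device.
__global__ void reportPath(int *path)
{
#if __CUDA_ARCH__ >= 200
    // Only present in images built for compute capability 2.0 or higher;
    // code that needs 2.x features would go here.
    *path = 200;
#else
    // Fallback path for images built for lower compute capabilities.
    *path = 100;
#endif
}

int main()
{
    int *d_path = NULL;
    int h_path = 0;

    if (cudaMalloc((void **)&d_path, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    reportPath<<<1, 1>>>(d_path);

    // A launch error here can mean the binary holds neither machine code
    // nor PTX usable on this device.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_path);
        return 1;
    }

    // cudaMemcpy waits for the kernel to finish before copying the result.
    err = cudaMemcpy(&h_path, d_path, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_path);
        return 1;
    }

    printf("code path compiled for __CUDA_ARCH__ >= %d\n", h_path);
    cudaFree(d_path);
    return 0;
}
```

If the program were instead built only with `-arch=sm_13`, the binary would contain just the fallback branch, which is why a higher-capability device cannot use the 2.x path in that case.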

tera
  • Thank you for the answer. I prefer creating a fat binary with code for all compute capabilities. – sgarizvi Sep 14 '12 at 12:56
  • Note that the best option is to do both (create a fat binary that also contains a PTX version). That way the program will always run even on future GPUs with different instruction sets. – tera Sep 14 '12 at 13:36
  • The second part of my previous comment didn't format right. I have moved it to the answer instead. – tera Sep 14 '12 at 13:38