5

I am using a Tesla K20 (compute capability 3.5) on Linux with CUDA 5. A simple child kernel launch gives a compile error: Unresolved extern function cudaLaunchDevice

My command line looks like:

nvcc --compile -G -O0 -g -gencode arch=compute_35,code=sm_35 -x cu -o fill.o fill.cu

I see cudadevrt.a in lib64. Do we need to add it, or what could be done to resolve this? Without the child kernel call everything works fine.

talonmies
Zahid

2 Answers

11

You must explicitly compile with relocatable device code enabled and link the device runtime library in order to use dynamic parallelism. So your compilation command must include --relocatable-device-code true and the linking command (which you haven't shown us) should include -lcudadevrt.

This procedure is described in detail in the "TOOLKIT SUPPORT FOR DYNAMIC PARALLELISM" section of the Dynamic Parallelism Programming Guide pdf, available here.
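As a rough illustration of the procedure, here is a minimal sketch of a parent kernel launching a child kernel, with the build commands in the header comment. The file name fill.cu is taken from the question; the kernel names fillParent and fillChild and the fill value are hypothetical, and the architecture flags assume a compute capability 3.5 device as in the question, so adjust them for your GPU and toolkit.

```cuda
// fill.cu: minimal dynamic parallelism sketch (kernel names are illustrative only).
//
// Separate compile and link steps:
//   nvcc -gencode arch=compute_35,code=sm_35 -rdc=true -c fill.cu -o fill.o
//   nvcc -gencode arch=compute_35,code=sm_35 fill.o -o fill -lcudadevrt
// Or a single step:
//   nvcc -gencode arch=compute_35,code=sm_35 -rdc=true fill.cu -o fill -lcudadevrt

#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes `value` into one element of `data`.
__global__ void fillChild(int *data, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = value;
    }
}

// Parent kernel: one thread launches the child grid from device code.
// This device-side launch is what requires -rdc=true and -lcudadevrt.
__global__ void fillParent(int *data, int n, int value)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        fillChild<<<blocks, threads>>>(data, n, value);
    }
}

int main()
{
    const int n = 1024;
    int *d_data = NULL;
    int h_data[n];

    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    fillParent<<<1, 1>>>(d_data, n, 7);

    // The parent grid is not complete until its child grid has finished,
    // so this host-side synchronization covers both.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_data);
        return 1;
    }

    cudaMemcpy(h_data, d_data, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("first = %d, last = %d\n", h_data[0], h_data[n - 1]);

    cudaFree(d_data);
    return 0;
}
```

Omitting -rdc=true from the compile step is what produces the unresolved cudaLaunchDevice error from the question; omitting -lcudadevrt should instead fail at the link step.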

talonmies
  • I already added the library under the Eclipse CUDA nvcc linker libraries, but it still gives the error. – Zahid Dec 16 '12 at 00:06
  • Now the command line looks like nvcc --compile -G -O0 -g -gencode arch=compute_35,code=sm_35 -x cu -o "fill.o" ../fill.cu -lcudadevrt. I tried both setting the library path and copying the cudadevrt lib into the project folder. The error still exists. – Zahid Dec 16 '12 at 00:29
  • @Zahid: The command you are writing is only compiling device code to an object file. You need to add the -lcudadevrt to the command which links the application. Did you read the pdf I linked to? – talonmies Dec 16 '12 at 07:24
  • It's not reaching the nvcc linker. It stops after the compile step with the error. Do I need to set the device linker mode to separate compilation? – Zahid Dec 16 '12 at 09:55
  • @Zahid: check my edit. I missed the fact you were not compiling with relocatable device code enabled. Add `-rdc=true` to the compile statement and -lcudadevrt to the linking statement. – talonmies Dec 16 '12 at 10:46
  • Thank you so much talonmies. It compiled successfully. The separate compilation mode in Eclipse can be used for the same purpose. But now it has issues with launch and debug. Thanks a million for the support. – Zahid Dec 16 '12 at 11:02
6

Perhaps I'm somewhat off topic, but I would like to mention that I had the same issue under Windows/Visual Studio 2010 and solved it by applying the last comment by talonmies in a few steps.

1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib

I hope that this information is useful.

Vitality