CUDA 5.0 dynamic parallelism error: ptxas fatal: unresolved extern function 'cudaLaunchDevice'

Question

I am using a Tesla K20 (compute capability 3.5) on Linux with CUDA 5. With a simple child kernel call, compilation fails with the error: Unresolved extern function cudaLaunchDevice

My command line looks like:

nvcc --compile -G -O0 -g -gencode arch=compute_35,code=sm_35 -x cu -o fill.o fill.cu

I see cudadevrt.a in lib64. Do we need to add it, or what could be done to resolve this? Without the child kernel call everything works fine.

talonmies · Accepted Answer · 2012-12-16T10:44:08.113


You must explicitly compile with relocatable device code enabled and link the device runtime library in order to use dynamic parallelism. So your compilation command must include --relocatable-device-code true and the linking command (which you haven't shown us) should include -lcudadevrt.

This procedure is described in detail in the "TOOLKIT SUPPORT FOR DYNAMIC PARALLELISM" section of the Dynamic Parallelism Programming Guide pdf, available here.
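As a minimal sketch of what this looks like in practice (the kernels `fillChild` and `fillParent` are hypothetical stand-ins, not the asker's code, and the build lines assume a toolkit that still supports sm_35), a device-side kernel launch plus the matching compile and link commands might look like this:

```cuda
// fill.cu: minimal dynamic-parallelism sketch (hypothetical kernels).
//
// Separate compile and link, assuming an sm_35 device:
//   nvcc -arch=sm_35 -rdc=true -c fill.cu -o fill.o
//   nvcc -arch=sm_35 -rdc=true fill.o -o fill -lcudadevrt
// Or in a single step:
//   nvcc -arch=sm_35 -rdc=true fill.cu -o fill -lcudadevrt

#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes `value` into one element.
__global__ void fillChild(int *data, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = value;
    }
}

// Parent kernel: one thread launches the child grid from the device.
// A device-side launch like this needs the device runtime (cudadevrt).
__global__ void fillParent(int *data, int n, int value)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        fillChild<<<blocks, threads>>>(data, n, value);
    }
}

int main()
{
    const int n = 1000;
    int host[n];
    int *dev = NULL;

    if (cudaMalloc((void **)&dev, n * sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }

    fillParent<<<1, 1>>>(dev, n, 7);

    // Waits for the parent and, implicitly, the child grid it launched.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // Verify that every element was written by the child kernel.
    for (int i = 0; i < n; ++i) {
        if (host[i] != 7) {
            printf("mismatch at %d\n", i);
            return 1;
        }
    }
    printf("OK\n");
    return 0;
}
```

Dropping `-rdc=true` from the compile step should reproduce the ptxas error from the question, since the device-side launch is then left as an unresolved external at compile time.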

edited Dec 16 '12 at 10:44

answered Dec 15 '12 at 08:20


I already added the library under the Eclipse CUDA NVCC linker libraries, but it still gives the error. – Zahid Dec 16 '12 at 00:06
Now the command line looks like: nvcc --compile -G -O0 -g -gencode arch=compute_35,code=sm_35 -x cu -o "fill.o" ../fill.cu -lcudadevrt. I tried both the library path and copying the cudadevrt lib into the project. The error still exists. – Zahid Dec 16 '12 at 00:29
@Zahid: The command you are writing is only compiling device code to an object file. You need to add the -lcudadevrt to the command which links the application. Did you read the pdf I linked to? – talonmies Dec 16 '12 at 07:24
It's not reaching the nvcc linker. It stops after the compile step with the error. Do I need to set the device linker mode to separate compilation? – Zahid Dec 16 '12 at 09:55
@Zahid: check my edit. I missed the fact you were not compiling with relocatable device code enabled. Add `-rdc=true` to the compile statement and `-lcudadevrt` to the linking statement. – talonmies Dec 16 '12 at 10:46
Thank you so much, talonmies. It compiled successfully. We can use separate compilation mode in Eclipse for the same purpose. But now it has issues with launch and debug. Thanks a million for the support. – Zahid Dec 16 '12 at 11:02

Vitality · Answer 2 · 2012-12-21T08:17:31.730

Perhaps I'm somewhat off topic, but I would like to mention that I had the same issue under Windows/Visual Studio 2010 and solved it by applying the last comment by talonmies in a few steps.

1) View -> Property Pages
2) Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code -> Yes (-rdc=true)
3) Configuration Properties -> CUDA C/C++ -> Device -> Code Generation -> compute_35,sm_35
4) Configuration Properties -> Linker -> Input -> Additional Dependencies -> cudadevrt.lib

I hope that this information is useful.
