I am working in Visual Studio 2015.
I am using CUDA 8.0.
My GPU supports compute capability 5.0 (GTX 960M).
I have been writing my code following the NVIDIA guide.
I am trying to use CUDA separate compilation (4 .cu files). To access functions defined in other .cu files, I use extern declarations of __device__ functions and __global__ kernels, but I keep getting the following errors:
1>GPU_Engine.cu.obj : error LNK2001: unresolved external symbol __cudaRegisterLinkedBinary_45_tmpxft_00001e30_00000000_8_GPU_Engine_cpp1_ii_1b52ddad
1>cplx.cu.obj : error LNK2001: unresolved external symbol __cudaRegisterLinkedBinary_39_tmpxft_00001150_00000000_8_cplx_cpp1_ii_I
1>basic.cu.obj : error LNK2001: unresolved external symbol __cudaRegisterLinkedBinary_40_tmpxft_00002648_00000000_8_basic_cpp1_ii_1458022c
1>time_evolution.cu.obj : error LNK2001: unresolved external symbol __cudaRegisterLinkedBinary_49_tmpxft_000022d0_00000000_8_time_evolution_cpp1_ii_df1c8d01
1>E:\0000_0003_Programs\Visual_Studio\Visual Studio 2015\Projects\GPU_Engine\x64\Release\GPU_Engine.exe : fatal error LNK1120: 4 unresolved externals
This looks to me like an MSVC linker error rather than an NVCC one.
I should point out that I am passing the --device-c flag in the VS properties of my .cu files.
I am also concerned about Project Properties > CUDA Linker > Command Line. The link command shown there mentions only one CUDA object file, and I am not sure whether that is correct.
# (Approximate command-line. Settings inherited from host are not visible below.)
# (Please see the output window after a build for the full command-line)
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\nvcc.exe" -dlink -o x64\Release\GPU_Engine.device-link.obj -Xcompiler "/EHsc /nologo /Zi "
Simplified code:
I have 4 separate .cu files:
- GPU_Engine.h + GPU_Engine.cu: class definition (class member functions launch CUDA kernels)
- Cplx.h + Cplx.cu: my complex type definition and support
- basic.h + basic.cu: basic functions of the mathematical model of a given physical system
- time_evolution.h + time_evolution.cu: specialized function for the model
GPU_Engine.h:
// nothing interesting...
GPU_Engine.cu:
// something before.
__device__
double potential(int& i, int& j, int& k) {
// do something.
}
__global__
void kernel_hamiltonian(Cplx* d_out, Cplx* d_psi, Cplx* d_lap) {
// do something.
}
// something after.
Cplx.h:
// type definition.
extern __device__ __constant__
Cplx I; // imaginary unit
// Cplx math support.
Cplx.cu:
__device__ __constant__
Cplx I; // cudaMemcpyToSymbol() inside GPU_Engine.cu in "start-up" section of code.
basic.h:
// nothing interesting...
basic.cu:
// something before.
extern __global__
void kernel_hamiltonian(Cplx* d_out, Cplx* d_psi, Cplx* d_lap); // defined in GPU_Engine.cu
// something after.
time_evolution.h:
// nothing interesting...
time_evolution.cu:
// something before.
extern __device__
double potential(int& i, int& j, int& k); // defined in GPU_Engine.cu
// something after
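
To make the pattern concrete, here is a minimal sketch of what I am trying to do, reduced to one extern __device__ function called from a kernel. The names helper.cu, main.cu, scale and kernel_scale are hypothetical and not from my project. I show it as a single listing for brevity; in the real layout the definition of scale would sit in its own .cu file and everything would be compiled with --device-c.

```cuda
// Minimal sketch of the cross-file pattern (hypothetical names throughout).
#include <cstdio>
#include <cuda_runtime.h>

// --- would live in main.cu: declaration of a function defined elsewhere ---
extern __device__ double scale(double x);

// --- would live in helper.cu: the actual definition ---
__device__ double scale(double x) {
    return 2.0 * x;
}

// --- main.cu again: kernel that calls the externally defined function ---
__global__ void kernel_scale(double* d_out, const double* d_in, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        d_out[idx] = scale(d_in[idx]);
    }
}

int main() {
    const int n = 4;
    double h_in[n] = {1.0, 2.0, 3.0, 4.0};
    double h_out[n] = {0.0, 0.0, 0.0, 0.0};

    double* d_in = nullptr;
    double* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(double));
    cudaMalloc(&d_out, n * sizeof(double));
    cudaMemcpy(d_in, h_in, n * sizeof(double), cudaMemcpyHostToDevice);

    kernel_scale<<<1, n>>>(d_out, d_in, n);

    // Report launch or execution errors instead of failing silently.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(double), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) {
        std::printf("%f\n", h_out[i]);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

My understanding is that once the definition is moved to a separate file, each .cu file must be compiled to relocatable device code and a device-link step must run over all of the resulting objects before the host linker sees them.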