I tried to link cuFFT statically.

nvcc -ccbin g++ -dc -O3 -arch=sm_35  -c fftStat.cu fftStat.o;
nvcc -ccbin g++ -dlink -arch=sm_35 fftStat.o -o link.o;
g++ main.cc link.o fftStat.o -lcudart -lcudadevrt -lcufft_static   -lculibos -ldl -pthread -lrt -L/usr/local/cuda-10.2/lib64 -o run

It gave me the following errors (not all of them are shown):

/usr/local/cuda-10.2/lib64/libcufft_static.a(fft_dimension_class_multi.o): In function `__sti____cudaRegisterAll()':
fft_dimension_class_multi.compute_75.cudafe1.cpp:(.text+0xdad): undefined reference to `__cudaRegisterLinkedBinary_44_fft_dimension_class_multi_compute_75_cpp1_ii_466e44ab'
/usr/local/cuda-10.2/lib64/libcufft_static.a(fft_dimension_class_multi.o): In function `global constructors keyed to BaseListMulti::radices':
fft_dimension_class_multi.compute_75.cudafe1.cpp:(.text+0x1c8d): undefined reference to 
float_64bit_regular_RT_SM50_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM50_plus_compute_75_cpp1_ii_66731515'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM50_plus.o): In function `global constructors keyed to compile_unitsforce_compile_float_width64_t_regular_fft_kernels__SM50_unbounded()':
float_64bit_regular_RT_SM50_plus.compute_75.cudafe1.cpp:(.text+0x29d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM50_plus_compute_75_cpp1_ii_66731515'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM60_plus.o): In function `__sti____cudaRegisterAll()':
float_64bit_regular_RT_SM60_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM60_plus_compute_75_cpp1_ii_dbb979db'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM60_plus.o): In function `global constructors keyed to compile_unitsforce_compile_float_width64_t_regular_fft_kernels__SM60_unbounded()':
float_64bit_regular_RT_SM60_plus.compute_75.cudafe1.cpp:(.text+0x18d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM60_plus_compute_75_cpp1_ii_dbb979db'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_regular_RT_SM53_plus.o): In function `__sti____cudaRegisterAll()':
half_32bit_regular_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_50_half_32bit_regular_RT_SM53_plus_compute_75_cpp1_ii_96a57339'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_regular_RT_SM53_plus.o): In function `global constructors keyed to compile_unitsforce_compile_half_width32_t_regular_fft_kernels__SM53_unbounded()':
half_32bit_regular_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x1b0d): undefined reference to `__cudaRegisterLinkedBinary_50_half_32bit_regular_RT_SM53_plus_compute_75_cpp1_ii_96a57339'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_vector_RT_SM53_plus.o): In function `__sti____cudaRegisterAll()':
half_32bit_vector_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to 
dpRadix0343C_cb.compute_75.cudafe1.cpp:(.text+0xa54): undefined reference to `__cudaRegisterLinkedBinary_34_dpRadix0343C_cb_compute_75_cpp1_ii_b592a056'
collect2: error: ld returned 1 exit status

Dynamic linking works:

g++ main.cc link.o fftStat.o -lcudart -lcudadevrt -lcufft -L/usr/local/cuda-10.2/lib64 -o run

I followed the nvcc separate-compilation guide (https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#code-changes-for-separate-compilation) and the cuFFT static-library guide (https://docs.nvidia.com/cuda/cufft/index.html#static-library), but apparently something is missing.

user3786219

1 Answer

Some of what you are trying to do at the final link needs to be done at the device-link step (your 2nd step): the static cuFFT library has to be passed to `nvcc -dlink` as well. The following seems to work for me:

$ cat fftStat.cu
#include <cufft.h>

void test(){

  cufftHandle h;
  cufftCreate(&h);
}

$ cat main.cpp
void test();

int main(){

  test();
}

$ nvcc -ccbin g++ -dc -O3 -arch=sm_35  -c fftStat.cu fftStat.o
$ nvcc -ccbin g++ -dlink -arch=sm_35 fftStat.o -o link.o -lcufft_static -lcudadevrt
$ g++ main.cpp link.o fftStat.o -L/usr/local/cuda-10.2/lib64   -lcufft_static -lcudart -lcudadevrt -lculibos -ldl -pthread -lrt  -o run

Note that I've also rearranged the link order to account for dependencies between the libraries. This may or may not matter depending on your exact version of g++. Some of the needs here (e.g. -lcudadevrt at the device-link step) may be a function of your actual code, which you haven't shown. For the above code, that item is not actually necessary.
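
The three steps above can also be collected into a small script. This is only a sketch under a few assumptions: the variable names `CUDA_LIB`, `ARCH` and `DRY_RUN` and the helper functions `run` and `build` are made up for this example, the compile step names its output explicitly with `-o`, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the three-step static cuFFT build shown above.
# CUDA_LIB, ARCH, DRY_RUN, run and build are hypothetical names for this example.
CUDA_LIB="${CUDA_LIB:-/usr/local/cuda-10.2/lib64}"
ARCH="${ARCH:-sm_35}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = print and execute

run() {
  # Print the command, then execute it unless this is a dry run.
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build() {
  # Step 1: compile to relocatable device code.
  run nvcc -ccbin g++ -dc -O3 -arch="$ARCH" fftStat.cu -o fftStat.o || return 1
  # Step 2: device link; the static cuFFT library is listed here too.
  run nvcc -ccbin g++ -dlink -arch="$ARCH" fftStat.o -o link.o \
      -lcufft_static -lcudadevrt || return 1
  # Step 3: host link, with libcufft_static ahead of the libraries it depends on.
  run g++ main.cpp link.o fftStat.o -L"$CUDA_LIB" \
      -lcufft_static -lcudart -lcudadevrt -lculibos -ldl -pthread -lrt -o run || return 1
}

build
```

Running it with `DRY_RUN=0` executes the commands; adjust `CUDA_LIB` and `ARCH` to your installation and GPU.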

Robert Crovella
  • Thank you! I did not know that the device-link stage (the 2nd stage in my example) requires additional libraries. In my defense, I just followed this example: `nvcc --gpu-architecture=sm_50 --device-c a.cu b.cu ; nvcc --gpu-architecture=sm_50 --device-link a.o b.o --output-file link.o; nvcc --lib --output-file libgpu.a a.o b.o link.o; g++ host.o --library=gpu --library-path= \ --library=cudadevrt --library=cudart` which is shown here https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#code-changes-for-separate-compilation – user3786219 Aug 06 '20 at 18:23