I cannot get TensorFlow with SyntaxNet to build with CUDA on Ubuntu 16.04. The same build succeeds without CUDA on this system.

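The error is most likely rooted in my configuration. The Bazel build of TensorFlow with CUDA generates linker commands for shared libraries that combine -shared with -pie, the option for producing position-independent executables. As the output below shows, the linker then pulls in the executable startup object Scrt1.o, whose _start references main. A shared library has no main, so the link fails with "undefined reference to `main'".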
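To find out which link actions are affected, a check along these lines could flag any command that carries both flags. This is only a sketch: `has_conflicting_link_flags` is a hypothetical helper of my own, not part of Bazel or TensorFlow, and the sample command is a shortened form of the failing one.

```python
import shlex


def has_conflicting_link_flags(command: str) -> bool:
    """Return True if a linker command line carries both -shared and -pie."""
    tokens = shlex.split(command)
    return "-shared" in tokens and "-pie" in tokens


# Shortened form of the failing link command from the Bazel output.
failing_cmd = (
    "crosstool_wrapper_driver_is_not_gcc -shared -o libspinlock_wait.so "
    "-Wl,-no-as-needed -B/usr/bin/ -pie -Wl,-z,relro,-z,now"
)

print(has_conflicting_link_flags(failing_cmd))
```

Matching whole tokens rather than substrings avoids false hits on arguments that merely contain the text "-pie", such as a file path.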

    /home/patrick/.cache/bazel/_bazel_patrick/5b9c9cf56f3e0138be05b0752b134bcb/external/com_google_absl/absl/base/BUILD.bazel:28:1: Linking of rule '@com_google_absl//absl/base:spinlock_wait' failed (Exit 1):
    crosstool_wrapper_driver_is_not_gcc failed: error executing command
      (cd /home/patrick/.cache/bazel/_bazel_patrick/5b9c9cf56f3e0138be05b0752b134bcb/execroot/__main__ && exec env - \
        CUDA_TOOLKIT_PATH=/usr/local/cuda \
        CUDNN_INSTALL_PATH=/usr/local/cuda \
        GCC_HOST_COMPILER_PATH=/usr/bin/gcc \
        LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64:/usr/local/cuda-9.0/nvvm/lib64 \
        NCCL_INSTALL_PATH=/usr \
        PATH=/home/patrick/bin:/home/patrick/.local/bin:/usr/local/cuda/bin:/usr/bin:/bin \
        PWD=/proc/self/cwd \
        PYTHON_BIN_PATH=/usr/bin/python \
        PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
        TF_CUDA_CLANG=0 \
        TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
        TF_CUDA_VERSION=9.0 \
        TF_CUDNN_VERSION=7 \
        TF_NCCL_VERSION=2 \
        TF_NEED_CUDA=1 \
        TF_NEED_OPENCL_SYCL=0 \
      external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -shared -o bazel-out/k8-opt/bin/external/com_google_absl/absl/base/libspinlock_wait.so -Wl,-no-as-needed -B/usr/bin/ -pie -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections -Wl,@bazel-out/k8-opt/bin/external/com_google_absl/absl/base/libspinlock_wait.so-2.params)
    /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
    (.text+0x20): undefined reference to `main'
    collect2: error: ld returned 1 exit status

The link command succeeds once the -pie option is removed. I would appreciate help with either of two things: a way to edit the linker flags Bazel uses, or a hint about the configuration error I made from anyone who has hit a similar problem. I doubt that posting my configuration steps would produce suggestions beyond the ones I have already read in other posts. The build process as a whole seems fragile to me. I have looked at the definitions in the CROSSTOOL and BUILD files; I did not edit them, and they look fine (-pie is only enabled for linking executables).
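The manual fix I applied when rerunning the command by hand can be written down as a small argument filter. This is a sketch only: `drop_pie_for_shared` is a hypothetical helper, and where such a filter could be hooked into the Bazel build is exactly the part I have not worked out.

```python
import shlex


def drop_pie_for_shared(argv):
    """Return a copy of argv without -pie, but only if -shared is present.

    Commands that link executables (no -shared) are returned unchanged,
    so they keep producing position-independent executables.
    """
    if "-shared" in argv:
        return [arg for arg in argv if arg != "-pie"]
    return list(argv)


# Shortened form of the failing link command from the Bazel output.
argv = shlex.split(
    "crosstool_wrapper_driver_is_not_gcc -shared -o libspinlock_wait.so "
    "-B/usr/bin/ -pie -Wl,-z,relro,-z,now"
)

# Print the filtered command so it can be pasted back into a shell.
print(" ".join(shlex.quote(arg) for arg in drop_pie_for_shared(argv)))
```

Leaving executable links untouched matters here, because the CROSSTOOL file intentionally enables -pie for them.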

I am working with:

  • Bazel 0.15.2
  • TensorFlow 1.8.0
  • Ubuntu 16.04
  • gcc 5.4
  • CUDA 9.0
  • cuDNN 7.1
  • NCCL 2.1
