I am trying to build CUDA-enabled JAX from source on a cluster running CentOS 7. In the JAX home directory, I run:

python build/build.py --enable_cuda --cuda_path=$CUDA_HOME --cudnn_path=$CUDNN_HOME

Here are my specs:

  • CUDA version: 11.6
  • cuDNN version: 7.5
  • gcc version: 11.2.0
  • Bazel binary path: ./bazel-5.1.1-linux-x86_64
  • Bazel version: 5.1.1
  • Python binary path: ~/.conda/envs/JAX/bin/python
  • Python version: 3.9
  • NumPy version: 1.21.5
  • MKL-DNN enabled: yes
  • Target CPU: x86_64
  • Target CPU features: release
  • CUDA enabled: yes

Here are my environment variables:

CC=/usr/local/app/compiler/gcc/11.2.0/bin/gcc
CXX=/usr/local/app/compiler/gcc/11.2.0/bin/g++
INCLUDE=/usr/local/app/compiler/gcc/11.2.0/include
LIBRARY_PATH=/usr/local/app/compiler/gcc/11.2.0/lib64:/usr/local/app/lib/nvidia/cuda/11.6.1/lib64:/usr/local/app/lib/nvidia/driver/510.47/lib64/nvidia
LD_LIBRARY_PATH=/usr/local/app/compiler/gcc/11.2.0/lib64:/usr/local/app/lib/nvidia/cuda/11.6.1/lib64:/usr/local/app/lib/nvidia/driver/510.47/lib64/nvidia

The Bazel build throws this error:

ERROR: /lustre1/home/rice_cake/.cache/bazel/_bazel_rice_cake/8366d0a62cbb3b115627233e356374ab/external/zlib/BUILD.bazel:5:11: Compiling uncompr.c failed: undeclared inclusion(s) in rule '@zlib//:zlib':
this rule is missing dependency declarations for the following files included by 'uncompr.c':
  '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include/stddef.h'
  '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include-fixed/limits.h'
  '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include-fixed/syslimits.h'
  '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include/stdarg.h'

It seems that Bazel is complaining that the zlib rule does not declare a dependency on header files that ship with my GCC 11.2.0 installation (stddef.h, limits.h, syslimits.h, and stdarg.h). Is there a way to solve this?
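
To see at a glance which compiler directories the error refers to, the header paths can be pulled out of the message. This is only a minimal sketch: the function name `undeclared_header_dirs` is hypothetical, and it assumes the error text arrives on standard input with each header path in single quotes, as in the message above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of JAX or Bazel): read a Bazel error message
# on stdin and print the unique directories of the quoted header files.
undeclared_header_dirs() {
  # 1. grep -o: keep only single-quoted absolute paths ending in ".h".
  # 2. tr: strip the surrounding quotes.
  # 3. dirname: reduce each header path to its containing directory.
  # 4. sort -u: print each directory once.
  grep -o "'/[^']*\.h'" \
    | tr -d "'" \
    | while IFS= read -r header; do dirname "$header"; done \
    | sort -u
}
```

Saving the build output to a file and running `undeclared_header_dirs < build.log` (the file name is an assumption) should list the `include` and `include-fixed` directories under the GCC 11.2.0 installation for the error shown here.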

The trouble is that these Bazel build files are automatically generated by jax/build/build.py, and I don't know how to fix them manually.

P.S. Building JAX from source succeeds on my local machine.

Thanks very much for any help, Rice

I went into the Bazel cache files and checked the BUILD file, and it's the same as the one on my local machine, so I'm a bit stuck. I suspect that setting some environment variables might help Bazel build JAX.

rice_cake

1 Answer

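To fix the "undeclared inclusion(s) in rule" error, try removing the entire Bazel cache under "/root/.cache/bazel/" (in your case, the cache lives under "/lustre1/home/rice_cake/.cache/bazel/", as shown in the error message). There is also good documentation on building JAX with Bazel.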
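
Clearing the cache could look roughly like the sketch below. The function name `clear_bazel_cache` is hypothetical, and the default location `~/.cache/bazel` is an assumption based on the path in the error message; pass a different directory as the first argument if your cache lives elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete a Bazel cache directory so the next build
# starts from scratch. Defaults to ~/.cache/bazel (an assumption).
clear_bazel_cache() {
  local cache_dir="${1:-$HOME/.cache/bazel}"
  if [ -d "$cache_dir" ]; then
    # Cached files may be read-only, so restore write permission first.
    chmod -R u+w "$cache_dir"
    rm -rf -- "$cache_dir"
    echo "removed $cache_dir"
  else
    echo "nothing to remove at $cache_dir"
  fi
}
```

Calling `clear_bazel_cache` with no argument targets the default location. Running `bazel clean --expunge` from the workspace is a narrower alternative that removes only that workspace's output base.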

SG_Bazel