I am trying to build CUDA-enabled JAX from source on a cluster running CentOS 7. In the JAX home directory, I run:
python build/build.py --enable_cuda --cuda_path=$CUDA_HOME --cudnn_path=$CUDNN_HOME
Here are my specs:
- cuda version: 11.6
- cudnn version: 7.5
- gcc version: 11.2.0
- Bazel binary path: ./bazel-5.1.1-linux-x86_64
- Bazel version: 5.1.1
- Python binary path: ~/.conda/envs/JAX/bin/python
- Python version: 3.9
- NumPy version: 1.21.5
- MKL-DNN enabled: yes
- Target CPU: x86_64
- Target CPU features: release
- CUDA enabled: yes
Here are my environment variables:
CC=/usr/local/app/compiler/gcc/11.2.0/bin/gcc
CXX=/usr/local/app/compiler/gcc/11.2.0/bin/g++
INCLUDE=/usr/local/app/compiler/gcc/11.2.0/include
LIBRARY_PATH=/usr/local/app/compiler/gcc/11.2.0/lib64:/usr/local/app/lib/nvidia/cuda/11.6.1/lib64:/usr/local/app/lib/nvidia/driver/510.47/lib64/nvidia
LD_LIBRARY_PATH=/usr/local/app/compiler/gcc/11.2.0/lib64:/usr/local/app/lib/nvidia/cuda/11.6.1/lib64:/usr/local/app/lib/nvidia/driver/510.47/lib64/nvidia
The Bazel build throws this error:
ERROR: /lustre1/home/rice_cake/.cache/bazel/_bazel_rice_cake/8366d0a62cbb3b115627233e356374ab/external/zlib/BUILD.bazel:5:11: Compiling uncompr.c failed: undeclared inclusion(s) in rule '@zlib//:zlib': this rule is missing dependency declarations for the following files included by 'uncompr.c': '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include/stddef.h' '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include-fixed/limits.h' '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include-fixed/syslimits.h' '/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include/stdarg.h'
It seems that Bazel is complaining that the '@zlib//:zlib' rule does not declare dependencies on GCC's own built-in header files (stddef.h, limits.h, syslimits.h, stdarg.h). Is there a way to solve this?
The trouble is that these Bazel build files are automatically generated by jax/build/build.py, and I don't know how to fix them manually.
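In case it helps with diagnosis, here is a minimal Python sketch that pulls the distinct header directories out of an "undeclared inclusion(s)" message like the one above, so they can be compared against whatever include directories Bazel detected for the compiler. The function name undeclared_include_dirs is hypothetical and not part of JAX or Bazel, and the parsing assumes the message has the same shape as the one I pasted.

```python
import os
import re


def undeclared_include_dirs(error_text):
    """Return the sorted, de-duplicated directories of the headers Bazel flagged.

    Hypothetical helper: it assumes the error text lists the offending headers
    as single-quoted absolute paths after the phrase
    "for the following files included by".
    """
    # Keep only the part of the message that lists the headers.
    _, _, tail = error_text.partition("for the following files included by")
    # Quoted absolute paths only; this skips the source file name ('uncompr.c').
    paths = re.findall(r"'(/[^']+)'", tail)
    return sorted({os.path.dirname(p) for p in paths})


if __name__ == "__main__":
    # Shortened copy of the error above (two of the four headers).
    sample = (
        "this rule is missing dependency declarations for the following files "
        "included by 'uncompr.c': "
        "'/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include/stddef.h' "
        "'/usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0/include-fixed/limits.h'"
    )
    for directory in undeclared_include_dirs(sample):
        print(directory)
```

On my error this gives just two directories, the include and include-fixed folders under /usr/local/app/compiler/gcc/11.2.0/lib/gcc/x86_64-redhat-linux/11.2.0, so every flagged header belongs to the GCC 11.2.0 installation itself.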
P.S. Building JAX from source succeeds on my local machine.
Thanks very much for any help, Rice
I went into the Bazel cache and checked the BUILD file, and it is the same as the one on my local machine, so I'm a bit stuck. I suspect that setting some environment variables might help Bazel build JAX.