1

I am trying to install apex on colab by Nvidia but failed several times. I tried number of different solutions including ones provided by Github official repository. I also tried answers provided here.

Every time I try I encounter error like this`

torch.__version__  = 1.9.0+cu102


/tmp/pip-req-build-xogkfxc5/setup.py:67: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
  warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
from /usr/local/cuda/bin

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-req-build-xogkfxc5/setup.py", line 171, in <module>
    check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
  File "/tmp/pip-req-build-xogkfxc5/setup.py", line 102, in check_cuda_torch_binary_vs_bare_metal
    raise RuntimeError("Cuda extensions are being compiled with a version of Cuda that does " +
RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 10.2.
In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.  You can try commenting out this check (at your own risk).
Running setup.py install for apex ... error ERROR: Command errored out with exit status 1: /home/liuyuan/anaconda3/envs/yolox/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-xogkfxc5/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-xogkfxc5/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /tmp/pip-record-wm_6rdtx/install-record.txt --single-version-externally-managed --compile --install-headers /home/liuyuan/anaconda3/envs/yolox/include/python3.8/apex Check the logs for full command output.

As you can see I am using latest versions of pytorch and cuda and also considered previous versions but every solutions were failed.

Abdul Ahad
  • 31
  • 1
  • 3

1 Answers1

2

To built apex on Colab, the cuda version of PyTorch and your system must match, as explained here.

Note that, e. g., apex.optimizers.FusedAdam, apex.normalization.FusedLayerNorm, etc. require CUDA and C++ extensions.

You can built apex on Colab using the following simple steps:

Query the version Ubuntu Colab is running on:

!lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:    18.04
Codename:   bionic

To get the current cuda version run:

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0 

Look-up the latest PyTorch built and compute plattform here. enter image description here

Next, got to the cuda toolkit archive and configure a version that matches the cuda-version of PyTorch and your OS-Version. enter image description here

Copy the installation instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

Remove Sudo and change the last line to include your cuda-version e.g., !apt-get -y install cuda-11-7 (without exclamation mark if run in shell directly):

!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
!apt-get update
!apt-get -y install cuda-11-7

Your cuda version will now be updated:

!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

Next, updated the outdated Pytorch version in Google Colab:

!pip install torch -U

Build apex. Depending on what you might require fewer global options:

!git clone https://github.com/NVIDIA/apex.git && cd apex && pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_multihead_attn" . && cd .. && rm -rf apex

...
Successfully installed apex-0.1

You can now import apex as desired:

from apex import optimizers, normalization
...
KarelZe
  • 1,466
  • 1
  • 11
  • 21