EDIT: Solved in the comments below.

I'm trying to get started with CUDA + RAPIDS. To do this, I've launched a VM on Google Compute using Ubuntu 18.04 and an NVIDIA Tesla K80. Here are the commands I've run in order to get the software installed:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get -y install cuda gcc
sudo apt-get -y autoremove

wget https://repo.anaconda.com/archive/Anaconda3-2019.07-Linux-x86_64.sh
bash Anaconda3-2019.07-Linux-x86_64.sh 
source ~/.bashrc

conda update -n base -c defaults conda

conda create --name test3.7 python=3.7
conda activate test3.7

conda install -c rapidsai -c nvidia -c numba -c conda-forge -c anaconda \
    cudf=0.9 cuml=0.9 cugraph=0.9 python=3.7 anaconda::cudatoolkit=10.0

conda install -c anaconda -c conda-forge -c plotly scipy chardet numpy pandas scikit-learn matplotlib plotly chart-studio

sudo shutdown -r now

I then try to run a small bit of Python using cudf instead of pandas, and I get the following error:

terminate called after throwing an instance of 'thrust::system::system_error'
  what(): parallel_for failed: no kernel image is available for execution on the device
Aborted (core dumped)

I'm not sure what I'm missing, since I've read numerous guides that all say, "You just need to run these handful of commands, and you're good to go!" and list the same commands. Most recently, I've read that I'm supposed to use nvcc to compile the CUDA drivers from source, but I can't find a guide anywhere that shows which commands to use (everyone just points to NVIDIA's several-hundred-page PDF instead of actually providing a helpful command). So, what else do I need to do to get CUDA + RAPIDS running on an Ubuntu 18.04 system using a Tesla K80?

Thank you!

  • RAPIDS is probably not compiled for K80 (cc 3.7). I think the general expectation is Pascal or newer (cc 6.x or higher) for the compiled binary versions of RAPIDS. Take a look at the prerequisites at http://rapids.ai/start.html. The easiest solution is probably to run on a machine with a Pascal or newer GPU. – Robert Crovella Sep 01 '19 at 02:12
  • Right you are - jeez, the requirements are certainly various and not spelled out in a single location. For future reference, use this page to find cards with correct Compute Capability version: https://developer.nvidia.com/cuda-gpus – Arthur Sommers Sep 01 '19 at 02:31

1 Answer

If you install a binary version of RAPIDS (e.g. via pip or conda), those packages currently expect a GPU of compute capability 6.0 or higher, as indicated here. Of course, this could change in the future.

To some degree this is a function of how the RAPIDS binaries are compiled, and it may also mean that RAPIDS uses features that are not available on earlier GPUs. You may be able to change this by building RAPIDS from source, but there is usually a good reason that code is compiled this way: typically it uses "newer" CUDA features that are not available on "older" GPUs.

Therefore, the easiest way to fix this is probably to switch from a machine with a Tesla K80 to a machine with a Tesla P100 or newer GPU.
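Before installing, it can help to check the GPU's compute capability against that 6.0 minimum. The following is a minimal sketch, not part of RAPIDS: the names MIN_RAPIDS_CC, meets_minimum, and detect_compute_capability are hypothetical, the threshold is simply the figure quoted above, and the detection step assumes Numba is installed and can see the GPU.

```python
# Minimum compute capability quoted above for prebuilt RAPIDS packages.
# This is an assumption taken from the answer text and may change over time.
MIN_RAPIDS_CC = (6, 0)


def meets_minimum(cc, minimum=MIN_RAPIDS_CC):
    """Return True if compute capability `cc` (major, minor) is at least `minimum`."""
    # Tuples compare element by element, so (3, 7) < (6, 0) < (7, 5).
    return tuple(cc) >= tuple(minimum)


def detect_compute_capability():
    """Ask Numba for the current device's compute capability.

    Returns None if Numba is not installed or no CUDA device is usable.
    """
    try:
        from numba import cuda
        return tuple(cuda.get_current_device().compute_capability)
    except Exception:
        return None


if __name__ == "__main__":
    cc = detect_compute_capability()
    if cc is None:
        print("Could not query a CUDA device through Numba.")
    elif meets_minimum(cc):
        print(f"Compute capability {cc[0]}.{cc[1]} meets the {MIN_RAPIDS_CC} minimum.")
    else:
        print(f"Compute capability {cc[0]}.{cc[1]} is below the {MIN_RAPIDS_CC} minimum.")
```

A Tesla K80 reports compute capability 3.7, so meets_minimum((3, 7)) is False, while a Tesla P100 reports 6.0 and passes the check.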

Robert Crovella