I've been trying to run RAPIDS on Google Colab pro, and have successfully installed the cuml and cudf packages, however I am unable to run even the example scripts.
TLDR;
Anytime I try to run the fit function for cuml on Google Colab I get the following error. I get this when using the demo examples both for installation and then for cuml. This happens for a range of cuml examples (I first hit this trying to run UMAP).
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-c06fc2c31ca3> in <module>()
13 knn.fit(X_train, y_train)
14
---> 15 knn.predict(X_test)
5 frames
cuml/neighbors/kneighbors_regressor.pyx in cuml.neighbors.kneighbors_regressor.KNeighborsRegressor.predict()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors.kneighbors()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors()
cuml/neighbors/nearest_neighbors.pyx in cuml.neighbors.nearest_neighbors.NearestNeighbors._kneighbors_dense()
/usr/local/lib/python3.7/site-packages/cuml/common/array.py in full(cls, shape, value, dtype, order)
326 """
327
--> 328 return CumlArray(cp.full(shape, value, dtype, order))
329
330 @classmethod
TypeError: full() takes from 2 to 3 positional arguments but 4 were given
Steps taken on Google Colab Pro (to reproduce error)
Here's an example, I install the relevant packages using this example from Rapids (https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&offline=true&sandboxMode=true):
# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable
import sys, os, shutil
sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
fn = 'lib'+so+'.so'
source_fn = '/usr/local/lib/'+fn
dest_fn = '/usr/lib/'+fn
if os.path.exists(source_fn):
print(f'Copying {source_fn} to {dest_fn}')
shutil.copyfile(source_fn, dest_fn)
# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
if 'libstdc' in so_file:
shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)
Then I try and run the example below from cuML (https://docs.rapids.ai/api/cuml/stable/api.html#k-means-clustering)
from cuml.neighbors import KNeighborsRegressor
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
X, y = make_blobs(n_samples=100, centers=5,
n_features=10)
knn = KNeighborsRegressor(n_neighbors=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.80)
knn.fit(X_train, y_train)
knn.predict(X_test)
This will result in the error at the start of the question.