
I am trying to run, in a Jupyter notebook, the example copied below from the RAPIDS cuML introduction to classification. It runs fine with n_samples under 6000 (this parameter sets the number of rows in the generated dataset).

import cuml
from cuml.datasets.classification import make_classification
from cuml.preprocessing.model_selection import train_test_split
from cuml.ensemble import RandomForestClassifier as cuRF
from sklearn.metrics import accuracy_score
from cupy import asnumpy

# synthetic dataset dimensions
n_samples = 1000
n_features = 10
n_classes = 2

# random forest depth and size
n_estimators = 25
max_depth = 10

# generate synthetic data [ binary classification task ]
X, y = make_classification ( n_classes = n_classes,
                             n_features = n_features,
                             n_samples = n_samples,
                             random_state = 0 )

X_train, X_test, y_train, y_test = train_test_split( X, y, random_state = 0 )

model = cuRF( max_depth = max_depth,
              n_estimators = n_estimators,
              random_state  = 0 )

%time trained_RF = model.fit ( X_train, y_train )

predictions = model.predict ( X_test )

cu_score = cuml.metrics.accuracy_score( y_test, predictions )
sk_score = accuracy_score( asnumpy( y_test ), asnumpy( predictions ) )

With n_samples above 6000, I get the CUDA error below and the kernel crashes. Note that:

  • increasing n_features from 10 to 5000 with n_samples = 5000 runs perfectly well, so the issue seems to be the number of rows in the dataset, not the number of columns
  • tested on both GPUs available on the machine (GTX 1050 2GB)
  • nvidia-smi shows under 25% GPU memory usage during the run
  • CUDA v11.2
  • driver version: 460.73.01
  • Ubuntu 18

Any help is greatly appreciated.

The CUDA error:

RuntimeError Traceback (most recent call last)
in ~/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/internals/api_decorators.py in inner_with_setters(*args, **kwargs)
    408 target_val=target_val)
    409
--> 410 return func(*args, **kwargs)
    411
    412 @wraps(func)

cuml/ensemble/randomforestclassifier.pyx in cuml.ensemble.randomforestclassifier.RandomForestClassifier.fit()

RuntimeError: CUDA error encountered at: file=/opt/conda/envs/rapids/conda-bld/libcuml_1614210250760/work/cpp/src/decisiontree/quantile/quantile.cuh line=150: call='cub::DeviceRadixSort::SortKeys( (void *)d_temp_storage->data(), temp_storage_bytes, &d_keys_in[batch_offset], d_keys_out->data(), n_sampled_rows, 0, 8 * sizeof(T), tempmem->stream)', Reason=cudaErrorInvalidValue:invalid argument
Obtained 64 stack frames
#0 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x46) [0x7fa9b83eef36]
#1 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft10cuda_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x69) [0x7fa9b83ef699]
#2 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML12DecisionTree19preprocess_quantileIfiEEvPKT_PKjiiiiSt10shared_ptrI15TemporaryMemoryIS2_T0_EE+0xaaf) [0x7fa9b84fea7f]
#3 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML12rfClassifierIfE3fitERKN4raft8handle_tEPKfiiPiiRPNS_20RandomForestMetaDataIfiEE+0xde3) [0x7fa9b8734b63]
#4 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML3fitERKN4raft8handle_tERPNS_20RandomForestMetaDataIfiEEPfiiPiiNS_9RF_paramsEi+0x1fd) [0x7fa9b872f54d]
#5 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/ensemble/randomforestclassifier.cpython-38-x86_64-linux-gnu.so(+0x3c7e5) [0x7fa98e6d97e5]
#6 in /home/oleg/anaconda3/envs/rapids/bin/python(PyObject_Call+0x255) [0x5589964052b5]
#7 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x21c1) [0x5589964b1de1]
#8 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) [0x558996490503]
#9 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1b2007) [0x558996492007]
#10 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x4ca3) [0x5589964b48c3]
#11 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) [0x558996490503]
#12 in /home/oleg/anaconda3/envs/rapids/bin/python(PyEval_EvalCodeEx+0x39) [0x558996491559]
#13 in /home/oleg/anaconda3/envs/rapids/bin/python(PyEval_EvalCode+0x1b) [0x5589965349ab]
#14 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x2731de) [0x5589965531de]
#15 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128d4b) [0x558996408d4b]
#16 in [ ..... removed for readability ] [0x5589964b1de1]
#54 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) [0x558996490503]
#55 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1b2007) [0x558996492007]
#56 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x1782) [0x5589964b13a2]
#57 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1925da) [0x5589964725da]
#58 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128d4b) [0x558996408d4b]
#59 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x13b3ea) [0x55899641b3ea]
#60 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x21da4f) [0x5589964fda4f]
#61 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128fc2) [0x558996408fc2]
#62 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x92f) [0x5589964b054f]
#63 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) [0x558996490503]

Oleg
  • I ran the code in your question on Google Colab with cuml 0.19.0 and n_samples=100000 and it ran to completion without error in 50ms. So this is either specific to whatever cuml version you have, or a problem with what you are running it on. – talonmies Jun 04 '21 at 02:13
  • It's a fresh Ubuntu 20.04 install with just the CUDA package installed (toolkit + drivers), then conda, then RAPIDS installed through conda. I tried different driver and CUDA versions, but no luck. I tried Colab as well, but you need to reinstall RAPIDS every time you launch a notebook, or create a GCS instance with RAPIDS, and at the moment there are no GPU resources available anywhere on GCS. If you have any alternative, I'm open. Thanks – Oleg Jun 07 '21 at 14:08
  • Unfortunately I don't. Maybe file a bug report. The thing is that it isn't your code. It is your system. [SO] isn't really the place to get help with that – talonmies Jun 07 '21 at 14:33
  • Yes, I filed one but got the same input. I'll try on some newer hardware, thanks! – Oleg Jun 08 '21 at 19:59

1 Answer


I found out that the issue is related to the experimental backend for random forests (RF) in cuML. Setting split_algo = 0 in the cuRF configuration solves the issue by falling back on the default backend. At the time of writing, this is 3 times slower than using the experimental backend.
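
For illustration, here is a minimal sketch of that change, reusing the depth, size and seed values from the question. The helper names rf_params and fit_rf are hypothetical and introduced only for this example, and the sketch assumes an installed cuML release whose RandomForestClassifier still accepts the split_algo argument.

```python
def rf_params(use_default_backend=True, max_depth=10, n_estimators=25):
    """Build keyword arguments for cuml.ensemble.RandomForestClassifier.

    Hypothetical helper: it only assembles the configuration, so it can be
    inspected on a machine without a GPU or a cuML install.
    """
    params = {
        "max_depth": max_depth,
        "n_estimators": n_estimators,
        "random_state": 0,
    }
    if use_default_backend:
        # split_algo = 0 is the setting described above. Assumption: the
        # installed cuML version still exposes this parameter.
        params["split_algo"] = 0
    return params


def fit_rf(X_train, y_train, **overrides):
    """Fit a cuML random forest; needs a GPU and a working cuML install."""
    # Imported lazily so rf_params() stays usable without cuML.
    from cuml.ensemble import RandomForestClassifier as cuRF

    model = cuRF(**{**rf_params(), **overrides})
    return model.fit(X_train, y_train)


# Show the configuration that would be passed to cuRF.
print(rf_params())
```

With the data from the question, the call would be trained_RF = fit_rf(X_train, y_train). Passing use_default_backend=False to rf_params leaves split_algo unset, which is one way to time the two backends against each other on the same data.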

Oleg