I am using the rapidsai docker container as obtained via
docker pull rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04
and have started it via
docker run --memory=30g --cpus=12 --gpus all --rm -it \
-p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04-py3.6
When I run the random_forest_mnmg_demo notebook via JupyterLab, I get the following accuracies:
SKLearn accuracy: 0.867
CuML accuracy: 0.833
While the notebook says that "Due to randomness in the algorithm, you may see slight variation in accuracies", I would not call this difference "slight".
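As a rough way to judge whether a gap like 0.867 vs 0.833 could be explained by sampling noise alone, here is a minimal sketch of an unpaired two-proportion z-score. The function name `accuracy_gap_z` and the test-set sizes are my own placeholders, not values taken from the notebook. Since both models are scored on the same test set, the unpaired formula is only an approximation.

```python
import math


def accuracy_gap_z(acc_a, acc_b, n_test):
    """Unpaired two-proportion z-score for two accuracies,
    each measured on a test set of n_test samples."""
    pooled = (acc_a + acc_b) / 2.0
    # Standard error of the difference between two proportions.
    se = math.sqrt(2.0 * pooled * (1.0 - pooled) / n_test)
    return (acc_a - acc_b) / se


# Hypothetical test-set sizes; substitute the size actually used.
for n_test in (100, 1000, 10000):
    z = accuracy_gap_z(0.867, 0.833, n_test)
    print(f"n_test={n_test:>6}: z = {z:.2f}")
```

A |z| well above 2 suggests the gap is unlikely to be pure sampling noise at that test-set size, while a small |z| means the test set is too small to tell the two models apart.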
As a side note: I have also tested and modified the other RF notebook (random_forest_demo) and observed accuracy differences as large as 0.95 vs 0.75 (for different data set sizes and RF parameters). According to the cuML documentation, the cuML node split algorithm is different from sklearn. Therefore, I have set split_algo = 0 and tried various n_bins values, without success. I have also tested h2o's RF implementation on random_forest_demo, and h2o and sklearn give very similar results most of the time.
There is a similar question on SO, but it seems that this issue was related to cuML version 0.12 and should have been fixed in version 0.14, which I am using. So there must be something else going on.
I have compared the sklearn and cuML parameter settings for RF and I think they should be close enough to produce similar results. Did I miss some configuration settings? Or might this be hardware related?
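For the parameter comparison mentioned above, a minimal sketch of how two settings dictionaries can be diffed is shown here. The helper `diff_params` is hypothetical, and the values in both dictionaries are illustrative placeholders, not the actual defaults of either library or the exact values from the notebooks.

```python
def diff_params(params_a, params_b):
    """Return (shared keys with different values,
    keys only in params_a, keys only in params_b)."""
    shared = params_a.keys() & params_b.keys()
    mismatched = {
        key: (params_a[key], params_b[key])
        for key in sorted(shared)
        if params_a[key] != params_b[key]
    }
    only_a = sorted(params_a.keys() - params_b.keys())
    only_b = sorted(params_b.keys() - params_a.keys())
    return mismatched, only_a, only_b


# Placeholder values for illustration only.
sk_params = {"n_estimators": 25, "max_depth": 13,
             "max_features": 1.0, "min_samples_split": 2}
cu_params = {"n_estimators": 25, "max_depth": 16,
             "max_features": 1.0, "n_bins": 15, "split_algo": 0}

mismatched, only_sklearn, only_cuml = diff_params(sk_params, cu_params)
print("different values:", mismatched)
print("sklearn only:", only_sklearn)
print("cuML only:", only_cuml)
```

In practice the two dictionaries would come from calling `get_params()` on each estimator, assuming both expose it in the scikit-learn style. Parameters that exist on only one side are the ones whose defaults most need a manual check.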
nvidia-smi output (executed on the host machine; the GPU is a "GeForce GTX 1050 Ti with Max-Q Design"):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   64C    P0    N/A /  N/A |   1902MiB /  4042MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
CUDA version as given by nvcc --version (run inside the container):
Cuda compilation tools, release 10.0, V10.0.130