I am using Python 3 with NVIDIA RAPIDS in order to speed up machine learning training, using the cuML library and a GPU.

My script also uses Keras for GPU training (on top of TensorFlow), and when I reach the stage where I try to use cuML I get a memory error. I suspect this is happening because TensorFlow does not release the GPU memory: looking at nvidia-smi, I see that all the memory is allocated.

This is the code I use to train the cuML model:

import cuml
from cuml import LinearRegression
lr = LinearRegression()
lr.fit(encoded_data, y_train)

This is the error I get:

[2] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

encoded_data and y_train are NumPy arrays; encoded_data is an n*m array of floats, and y_train is an n*1 vector of integer labels. Both work fine when training with sklearn's logistic regression.
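
For reference, a minimal sketch of what the inputs look like (the sizes and dtypes here are just assumptions for illustration):

import numpy as np

n, m = 100_000, 128                                       # assumed sizes for illustration
encoded_data = np.random.rand(n, m).astype(np.float32)    # n*m array of floats
y_train = np.random.randint(0, 2, size=n)                 # n*1 vector of integer labels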

How can I either:

1. Use the same GPU (preferred) without losing all the TF models I trained? (I have more free memory than the TF models need in practice, but the TF process is still holding all of it.)
2. Use my second GPU for the cuML calculations? (I can't find a way to select which GPU the RAPIDS cuML model training runs on.)

1 Answer

I'm going to answer #2 below, as it will get you on your way the fastest. It's 3 lines of code. For #1, please raise an issue on the RAPIDS GitHub or ask a question in our Slack channel.

First, run nvidia-smi to get your GPU numbers and to see which one has its memory allocated to Keras. Here's mine:

nvidia-smi
Fri Jun 28 16:50:06 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GV100        Off  | 00000000:15:00.0 Off |                  Off |
| 29%   40C    P2    26W / 250W |  32326MiB / 32478MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GV100        Off  | 00000000:2D:00.0  On |                  Off |
| 33%   46C    P0    29W / 250W |    260MiB / 32470MiB |     26%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Here, there are GPU #0 and GPU #1. GPU #0 has nearly all of its memory in use. If I want to run something else with RAPIDS, I'll need to use GPU #1.
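
If you'd rather check free memory from Python instead of reading nvidia-smi output, a short sketch using the pynvml bindings would look roughly like this:

import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU {i}: {mem.free / 1024**2:.0f} MiB free of {mem.total / 1024**2:.0f} MiB")
pynvml.nvmlShutdown()

Either way, once you know which GPU has memory available, restrict CUDA to it before importing cuml: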

import os

# Select a particular GPU to run the notebook on.
# Note: this must be set before any CUDA context is created (e.g. before importing cuml).
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # replace "1" with whichever GPU you want to use

Then run the rest of your code.
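
Putting it together, a minimal sketch (reusing the encoded_data and y_train from your question) looks like this:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # set this before importing cuml

from cuml import LinearRegression

lr = LinearRegression()
lr.fit(encoded_data, y_train)              # now runs on GPU #1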

Please let me know if this helps or if you need further assistance.

TaureanDyerNV