Could you help me with this Python cuML error?
UnownedMemory requires explicit device ID for a null pointer.
I am running leave-one-dataset-out cross-validation with a cuML Random Forest. At each iteration of the for-loop, I concatenate all of the pandas DataFrames except one, train a random forest on the combined data, and compute the error on the held-out dataset. After the first three iterations complete, I get the error above. I am not sure what I am doing wrong in terms of memory management. Each dataset is quite large (about 2 GB) and there are 29 of them, but the GPU gets through three rounds of training on 28 datasets without any issues. I am using Python 3.9 and CuPy 10.6.0.
import cudf
import cuml
import cupy
import pandas as pd
import numpy as np
import os
import time
def cross_validation(dfs):
    try:
        frac = 0.4
        print('Cross validation on', len(dfs), 'datasets.')
        names = dfs.keys()  # dfs is a dictionary of dataset names : datasets
        # Iterate over each dataframe, to keep it out.
        for nam in names:
            # Keep out the dataset corresponding to nam.
            print('Leaving', nam, 'out.')
            start = time.time()
            mdfs = dfs.copy()  # shallow copy: the DataFrames themselves are not duplicated
            mdfs.pop(nam)
            # Have the odd dataset ready for testing.
            df = dfs[nam]
            # Compile, train, and test.
            train = pd.DataFrame()
            for mdf in mdfs:
                # Append a 40% random sample of each remaining dataset.
                train = pd.concat([train, mdfs[mdf].sample(frac=frac)], axis=0, ignore_index=True)
            print('Training now.')
            y_train = train['ActualError']
            train = train.drop(columns=['ActualError'])
            regr = cuml.RandomForestRegressor(n_estimators=20, min_samples_leaf=200)
            regr.fit(train, y_train)
            test_features = df
            prediction = regr.predict(test_features)
            # Mean absolute error on the held-out dataset.
            error = np.mean(np.abs(df['copdem'] - prediction - df['bench']))
            print('survived', cupy.get_default_memory_pool().used_bytes(), cupy.get_default_memory_pool().total_bytes())
        return 0
    except Exception as e:
        print(e)
        return -1
The used bytes and total bytes printed at the end of each iteration are always 0. I have tried deleting the random forest, freeing everything in the default memory pool, and switching to different memory allocators, but nothing has helped.
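To make it easier to rule out the splitting logic itself, here is a minimal CPU-only sketch of the leave-one-dataset-out loop described at the top, with the sampling, training, and GPU calls removed. The helper name `leave_one_out` and the toy dictionary are made up for illustration and are not part of my real code; placeholder strings stand in for the DataFrames.

```python
from typing import Dict, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def leave_one_out(datasets: Dict[str, T]) -> Iterator[Tuple[str, T, List[T]]]:
    """Yield (held-out name, held-out dataset, list of remaining datasets)."""
    for held_out_name, held_out in datasets.items():
        # Build the training list without copying or mutating the dictionary.
        training = [d for name, d in datasets.items() if name != held_out_name]
        yield held_out_name, held_out, training


if __name__ == "__main__":
    # Placeholder strings stand in for the real DataFrames.
    toy = {"a": "df_a", "b": "df_b", "c": "df_c"}
    for name, test_set, train_sets in leave_one_out(toy):
        print(name, test_set, train_sets)
```

Each fold only collects references to the existing objects, so this part on its own should not add any memory pressure; the concatenation, sampling, and training would happen inside the loop body.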