I'm doing some hyper-parameter tuning, so speed is key. I've got a nice workstation with both an AMD Ryzen 9 5950X and an NVIDIA RTX 3060 Ti (8 GB).
Setup:
- xgboost 1.5.1 (installed from PyPI) in an anaconda environment
- NVIDIA graphics driver 471.68
- CUDA 11.0
When training an xgboost model using the scikit-learn API, I pass the tree_method='gpu_hist' parameter. I noticed that it is consistently outperformed by the default tree_method='hist'.
Somewhat surprisingly, this holds even when I open multiple consoles (I work in Spyder) and start an Optuna study in each of them, each tuning a different scikit-learn model, until my CPU usage is at 100%. When I then compare tree_method='gpu_hist' with tree_method='hist', tree_method='hist' is still faster!
How is this possible? Are my drivers configured incorrectly? Is my dataset too small to benefit from tree_method='gpu_hist' (7000 samples, 50 features, 3-class classification)? Or is the RTX 3060 Ti simply outclassed by the Ryzen 9 5950X? Or none of the above?
Any help is highly appreciated :)
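In case it helps anyone reproduce the comparison: a synthetic dataset with the same shape as mine (7000 samples, 50 features, 3 classes) can be generated with numpy. The names and the random values below are placeholders, not my actual data:

```python
import numpy as np

# Placeholder data matching the real training set's shape:
# 7000 samples, 50 features, 3 classes.
rng = np.random.default_rng(42)
n_samples, n_features, n_classes = 7000, 50, 3

X_train = rng.normal(size=(n_samples, n_features))
y_train = rng.integers(0, n_classes, size=n_samples)

print(X_train.shape, y_train.shape)  # (7000, 50) (7000,)
```

Fitting XGBClassifier(tree_method='hist') and XGBClassifier(tree_method='gpu_hist') on this, and then again with a much larger n_samples, should show roughly where the GPU starts to pay off.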
Edit @Ferdy: I carried out this little experiment:
import time
import numpy as np
from xgboost import XGBClassifier

def fit_10_times(tree_method, X_train, y_train):
    times = []
    for _ in range(10):
        model = XGBClassifier(tree_method=tree_method)
        start = time.time()
        model.fit(X_train, y_train)
        times.append(time.time() - start)
    return times

cpu_times = fit_10_times('hist', X_train, y_train)
gpu_times = fit_10_times('gpu_hist', X_train, y_train)

print(X_train.describe())
print('mean cpu training times: ', np.mean(cpu_times), 'standard deviation :', np.std(cpu_times))
print('all training times :', cpu_times)
print('----------------------------------')
print('mean gpu training times: ', np.mean(gpu_times), 'standard deviation :', np.std(gpu_times))
print('all training times :', gpu_times)
Which yielded this output:
mean cpu training times: 0.5646213531494141 standard deviation : 0.010005875058323703
all training times : [0.5690040588378906, 0.5500047206878662, 0.5700047016143799, 0.563004732131958, 0.5570034980773926, 0.5486617088317871, 0.5630037784576416, 0.5680046081542969, 0.57651686668396, 0.5810048580169678]
----------------------------------
mean gpu training times: 2.0273998022079467 standard deviation : 0.05105794761358874
all training times : [2.0265607833862305, 2.0070691108703613, 1.9900789260864258, 1.9856727123260498, 1.9925382137298584, 2.0021069049835205, 2.1197071075439453, 2.1220884323120117, 2.0516715049743652, 1.9765043258666992]
The peak in CPU usage corresponds to the CPU training runs, and the peak in GPU usage to the GPU training runs.
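One caveat about the measurement itself (an aside, not part of the experiment above): time.time can have coarse resolution on some platforms, so time.perf_counter is usually the better standard-library clock for benchmarks like this. A small sketch with a stand-in workload; the time_call helper is just illustrative:

```python
import time

def time_call(fn, *args, n=10):
    """Run fn(*args) n times and return the per-call durations in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn(*args)
        durations.append(time.perf_counter() - start)
    return durations

# Stand-in workload; in the experiment above this would be model.fit(...)
durations = time_call(sum, range(1_000_000))
print(len(durations))  # 10
```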