I'm currently trying to run an XGBoost model locally in a Jupyter notebook. I have a dataset with shape (68799, 85). The idea is to use Optuna to conduct hyperparameter tuning (I've sketched the intended tuning loop below). However, the Jupyter kernel dies with the message "The kernel appears to have died. It will restart automatically."
The kernel dies when I convert my train, test and validation sets to the DMatrix data type, before any model fitting/tuning begins:
```python
import xgboost as xgb
from sklearn.model_selection import train_test_split

# df is the (68799, 85) DataFrame described above
features = df.drop(["Target"], axis=1)  # X
target = df["Target"]                   # y

train_ratio = 0.8
val_ratio = 0.1
test_ratio = 0.1

# Creating train and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=1 - train_ratio, random_state=42
)

# Splitting the remaining 20% evenly into validation and test sets
X_val, X_test, y_val, y_test = train_test_split(
    X_test, y_test, test_size=test_ratio / (test_ratio + val_ratio), random_state=42
)

# Converting to DMatrix -- this is where the kernel dies
d_train = xgb.DMatrix(X_train, label=y_train)
d_test = xgb.DMatrix(X_test, label=y_test)
d_val = xgb.DMatrix(X_val, label=y_val)
```
I will eventually provision EC2 resources on AWS, but I wanted to check whether Jupyter is crashing because the dataset is too large, or whether there is some underlying issue with Jupyter or even my machine:
Processor: 2.6 GHz 6-Core Intel Core i7
Memory: 16 GB 2667 MHz DDR4
Graphics: AMD Radeon Pro 5300M 4 GB / Intel UHD Graphics 630 1536 MB
Has anyone come across such an issue before? It just seems odd that the kernel dies before the model has even started training.
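In case it helps with the "is the data too large?" question, this is the kind of quick check I can run before the DMatrix conversion. It only uses pandas introspection plus some back-of-the-envelope arithmetic, so the printed numbers are illustrative rather than results I've already collected:

```python
# Rough in-memory size of the full DataFrame (deep=True also counts object/string columns)
print(f"DataFrame size: {df.memory_usage(deep=True).sum() / 1e9:.2f} GB")

# Column dtypes: object columns can inflate both the DataFrame and the
# DMatrix conversion well beyond the raw numeric size
print(df.dtypes.value_counts())

# Back-of-the-envelope size of the dense training matrix
n_rows, n_cols = X_train.shape
print(f"Dense float64: {n_rows * n_cols * 8 / 1e9:.3f} GB")
print(f"Dense float32: {n_rows * n_cols * 4 / 1e9:.3f} GB")
```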