I have a small dataset with imbalanced classes:
0 : 142
1 : 29
I am trying to find the right method to deal with this issue and the best algorithm to use.
So far, the best results I have obtained came from combining oversampling with SMOTE and undersampling with RandomUnderSampler, and then fitting a ClassificationTree from interpretML.
I get a score of 0.88 and a reasonable confusion matrix, but I need better results:
     0    1
0   27    3
1    4   23
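As a quick sanity check, the 0.88 score matches plain accuracy on this matrix: (27 + 23) / 57 ≈ 0.877. The sketch below recomputes it together with per-class recall and precision. It assumes rows are true labels and columns are predictions (the scikit-learn convention); the post does not state the orientation, and it only affects the per-class numbers, not the accuracy.

```python
# Confusion matrix from the post.
# Assumption: rows = true class, columns = predicted class.
cm = [[27, 3],
      [4, 23]]

total = sum(sum(row) for row in cm)      # number of test samples
correct = cm[0][0] + cm[1][1]            # diagonal = correct predictions
accuracy = correct / total

# Recall: correct predictions divided by the true count of each class (row sum).
recall = [cm[i][i] / sum(cm[i]) for i in range(2)]
# Precision: correct predictions divided by the predicted count of each class (column sum).
precision = [cm[i][i] / (cm[0][i] + cm[1][i]) for i in range(2)]

print(f"samples   = {total}")
print(f"accuracy  = {accuracy:.3f}")
print("recall    =", [round(r, 3) for r in recall])
print("precision =", [round(p, 3) for p in precision])
```

The matrix holds 57 test samples, 27 of them in class 1, while the original data has only 29 class-1 rows in total. The test set is therefore drawn from the resampled data and includes SMOTE-generated rows.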
I need to improve the score and get better predictions.
Here is my code:
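from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from interpret.glassbox import ClassificationTree
from sklearn.model_selection import train_test_split

# Oversample the minority class with SMOTE, then apply random undersampling
oversample = SMOTE()
X_over, y_over = oversample.fit_resample(X, y)
under = RandomUnderSampler()
X_ovun, y_ovun = under.fit_resample(X_over, y_over)
# Split the resampled data into train (80%) and test (20%) sets
seed = 1
X_train, X_test, y_train, y_test = train_test_split(X_ovun, y_ovun, test_size=0.20, random_state=seed)
# Fit the tree and score it on the test split
ct = ClassificationTree(random_state=seed)
ct.fit(X_train, y_train)
ct.score(X_test, y_test)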
Any advice to improve the results would be welcome!