
mates. I was supposed to handle this data with oversampling, so I chose SMOTE, ADASYN, and BorderlineSMOTE to figure out which sampling method works best.

The thing is, when I apply those three samplers, they seem to create exactly the same number of synthetic instances: the shapes of the resampled training sets are identical, and the confusion matrices are identical as well.

Is there something I'm missing? Thanks.

** I used the dataset below: https://www.kaggle.com/datasets/andrewmvd/fetal-health-classification

`from imblearn.over_sampling import SMOTE, ADASYN, BorderlineSMOTE
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

smote = SMOTE(random_state=42)  # defined earlier in the notebook

borderline_smote = BorderlineSMOTE(random_state=42)
X_train_border, y_train_border = smote.fit_resample(X_train, y_train)
ovr_clf = OneVsRestClassifier(SVC(kernel='linear', random_state=42))
ovr_clf.fit(X_train_border, y_train_border)
y_pred = ovr_clf.predict(X_test)


adasyn = ADASYN(random_state=42)
X_train_adasyn, y_train_adasyn = smote.fit_resample(X_train, y_train)
ovr_clf = OneVsRestClassifier(SVC(kernel='linear', random_state=42))
ovr_clf.fit(X_train_adasyn, y_train_adasyn)
y_pred = ovr_clf.predict(X_test)

print('Accuracy Score: {:.2f}%'.format(accuracy_score(y_test, y_pred) * 100))
print('Report:\n' + classification_report(y_test, y_pred))`
Nini