I am confused by the different F1 computations. Which F1 scoring should I use for severely imbalanced data? I am working on a severely imbalanced binary classification problem.
'f1'
'f1_micro'
'f1_macro'
'f1_weighted'
Also, I want to use balanced_accuracy_score(y_true, y_pred, adjusted=True) for the balanced_accuracy entry of the scoring argument. How can I incorporate this into my code?
from sklearn.model_selection import cross_validate
from sklearn.metrics import make_scorer
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from imblearn.metrics import geometric_mean_score
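# Load the example binary classification dataset
X, y = load_breast_cancer(return_X_y=True)
# Wrap the geometric mean metric so it can be used as a scorer
gm_scorer = make_scorer(geometric_mean_score, greater_is_better=True)
# Evaluate several metrics at once with 5-fold cross-validation
scores = cross_validate(
    LogisticRegression(max_iter=100000),
    X,
    y,
    cv=5,
    scoring={'gm_scorer': gm_scorer, 'F1': 'f1', 'Balanced Accuracy': 'balanced_accuracy'},
)
scores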
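For what it's worth, here is a minimal sketch of one way the adjusted variant might be wired in. It relies on make_scorer forwarding extra keyword arguments to the metric function, so adjusted=True should reach balanced_accuracy_score on every fold. The key name 'Adjusted Balanced Accuracy' and the variable adjusted_bal_acc are my own placeholders, and I have left the imblearn scorer out so the snippet only needs scikit-learn.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, make_scorer
from sklearn.model_selection import cross_validate

X, y = load_breast_cancer(return_X_y=True)

# Extra keyword arguments given to make_scorer are passed on to the metric,
# so this scorer calls balanced_accuracy_score(y_true, y_pred, adjusted=True).
adjusted_bal_acc = make_scorer(balanced_accuracy_score, adjusted=True)

scores = cross_validate(
    LogisticRegression(max_iter=100000),
    X,
    y,
    cv=5,
    scoring={
        "F1": "f1",
        "Balanced Accuracy": "balanced_accuracy",
        # Placeholder key name; any string works as a dictionary key here.
        "Adjusted Balanced Accuracy": adjusted_bal_acc,
    },
)

# Results are stored under "test_<key>" for each scorer in the dictionary.
print(scores["test_Balanced Accuracy"])
print(scores["test_Adjusted Balanced Accuracy"])
```

For a binary problem the adjusted score is the plain balanced accuracy rescaled so that chance level maps to 0, which makes the two columns easy to cross-check against each other.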