I'm training a sentence-pair binary classification model using RoBERTa, but the model does not learn the positive class (label 1). My dataset is imbalanced, with the following class counts:
Training data:
label 0: 140623
label 1: 5537
Validation data:
label 0: 35156
label 1: 1384
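For scale, the positive class makes up roughly 3.8% of each split. Below is a quick sketch of that arithmetic, together with the macro F1 that a baseline predicting label 0 for every validation example would score. The helper names (`positive_rate`, `f1`) are only illustrative and not part of any library.

```python
# Class counts copied from the question.
train = {0: 140623, 1: 5537}
val = {0: 35156, 1: 1384}


def positive_rate(counts):
    """Fraction of examples carrying label 1."""
    return counts[1] / (counts[0] + counts[1])


def f1(tp, fp, fn):
    """F1 score from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Baseline: predict label 0 for every validation example.
# Treating label 1 as the positive class: no true or false positives, all 1s missed.
f1_label1 = f1(tp=0, fp=0, fn=val[1])
# Treating label 0 as the positive class: every 0 is found, every 1 is a false positive.
f1_label0 = f1(tp=val[0], fp=val[1], fn=0)
macro_f1 = (f1_label0 + f1_label1) / 2

print(f"train positive rate: {positive_rate(train):.4f}")
print(f"val positive rate:   {positive_rate(val):.4f}")
print(f"all-zeros baseline macro F1 on val: {macro_f1:.4f}")
```

The baseline lands at a macro F1 of about 0.49, which is the number any model on this data has to beat.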
After training, the model produces 0 true positives and 0 false positives on the validation data, i.e. it predicts label 0 for every example. I evaluate with macro F1, but how should I handle the class imbalance during training? Several articles mention that BERT handles imbalance by itself, but that does not seem to happen in my case.
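One commonly suggested option is to weight the loss by inverse class frequency. Here is a minimal sketch of how such weights could be derived from the training counts, assuming the "balanced" heuristic `n_samples / (n_classes * class_count)`. The function name `balanced_class_weights` is hypothetical, and whether this weighting actually helps on this dataset is untested.

```python
# Training class counts copied from the question.
train_counts = {0: 140623, 1: 5537}


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This is an assumed heuristic, not something prescribed by RoBERTa or BERT.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


weights = balanced_class_weights(train_counts)
for label in sorted(weights):
    print(f"label {label}: weight {weights[label]:.3f}")

# Ordered list of weights, indexed by label. In PyTorch this could be passed as
# torch.nn.CrossEntropyLoss(weight=torch.tensor(weight_list)) and applied to the
# model's logits in place of the default unweighted loss.
weight_list = [weights[label] for label in sorted(weights)]
```

With these counts, an error on a label 1 example costs roughly 25 times as much as an error on a label 0 example.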
I am using this dataset.
Any help is appreciated.