
I have a dataset of 15k records. I trained a 'bert' model with the ktrain package on 5k samples, using a 70-30% train-test split, and the test results gave accuracy and F1 scores of 93-94%, so I felt the model was well trained. But when I evaluated the trained model on the remaining 10k records of my dataset, around 10% of the samples were predicted with the wrong label at high confidence. My plan is to pass a text on to the next processing stage only if the prediction confidence is high, but if the model maps samples to the wrong label with high confidence, there is no way to say the model is good. How should I handle this, and what kinds of techniques can be applied to get correct predictions here?
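For context, the confidence-gating step I have in mind looks roughly like this (a minimal NumPy sketch; the `probs` array and threshold value are placeholders standing in for the per-class probabilities my trained predictor returns):

```python
import numpy as np

# probs: (n_samples, n_classes) softmax outputs from the classifier.
# Placeholder values here; in my pipeline these come from the BERT predictor.
probs = np.array([
    [0.97, 0.02, 0.01],   # a confident prediction
    [0.40, 0.35, 0.25],   # an uncertain prediction
])

CONF_THRESHOLD = 0.90  # only high-confidence predictions move to the next stage

labels = probs.argmax(axis=1)      # predicted class per sample
confidence = probs.max(axis=1)     # confidence of that prediction
route_mask = confidence >= CONF_THRESHOLD

# Samples that pass the gate go to downstream processing; the rest are held back.
to_next_stage = np.where(route_mask)[0]
held_back = np.where(~route_mask)[0]
```

The problem is exactly the samples that pass this gate (high `confidence`) but whose `labels` are wrong.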
