
I'm working on a multimodal classifier (text + image) with only 2 classes, using PyTorch.

Since I don't have a lot of data, I've decided to use StratifiedKFold to avoid overfitting.
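For context, here is a simplified sketch of how I set up the folds. The dummy tensors and feature sizes below are only placeholders standing in for my real text + image dataset, not my actual code:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset
from sklearn.model_selection import StratifiedKFold

# Dummy data standing in for my real text + image dataset (2 classes)
n_samples = 200
features = torch.randn(n_samples, 32)
labels = torch.randint(0, 2, (n_samples,))
dataset = TensorDataset(features, labels)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(skf.split(features.numpy(), labels.numpy())):
    train_loader = DataLoader(Subset(dataset, train_idx), batch_size=16, shuffle=True)
    test_loader = DataLoader(Subset(dataset, test_idx), batch_size=16)
    # a fresh model is trained on train_loader for each fold,
    # then evaluated on test_loader (the held-out part of the split)
```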

I noticed a strange behavior on training/testing curves.

[image: training curves]

My training accuracy quickly converges to a single value, stays there for a few epochs, and then starts evolving again.

Seeing these results, I immediately thought of overfitting, with 0.67 being the maximum accuracy the model could reach.

Using the rest of the data held out by the KFold split, I tested my model in evaluation mode.
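Roughly, the evaluation looks like this (a self-contained sketch: the tiny linear model and random loader are placeholders for my real multimodal model and the held-out fold):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholders so the snippet runs on its own; in my real code `model` is the
# multimodal network and `test_loader` is the held-out fold.
model = nn.Linear(32, 2)
test_loader = DataLoader(
    TensorDataset(torch.randn(40, 32), torch.randint(0, 2, (40,))), batch_size=16
)
criterion = nn.CrossEntropyLoss()

model.eval()
correct, total, loss_sum = 0, 0, 0.0
with torch.no_grad():
    for inputs, targets in test_loader:
        logits = model(inputs)                       # shape (batch, 2)
        loss_sum += criterion(logits, targets).item() * targets.size(0)
        correct += (logits.argmax(dim=1) == targets).sum().item()
        total += targets.size(0)
print(f"test accuracy={correct / total:.3f}  test loss={loss_sum / total:.3f}")
```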

[image: test curves]

I was quite surprised: the test accuracy follows the training accuracy almost exactly, while the loss (CrossEntropyLoss) keeps evolving.
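To illustrate what I mean, here is a tiny standalone example (made-up logits, not my data) of how accuracy can stay frozen while CrossEntropyLoss keeps moving, since accuracy only depends on the argmax of the logits:

```python
import torch
import torch.nn.functional as F

targets = torch.tensor([1, 0, 1])
# Two sets of made-up logits with the same argmax (so identical accuracy)
# but different confidence, hence different cross-entropy.
logits_a = torch.tensor([[0.1, 0.3], [0.2, 0.1], [0.0, 0.4]])
logits_b = torch.tensor([[-1.0, 2.0], [2.5, -0.5], [-0.8, 1.7]])

for name, logits in [("a", logits_a), ("b", logits_b)]:
    acc = (logits.argmax(dim=1) == targets).float().mean().item()
    loss = F.cross_entropy(logits, targets).item()
    print(f"{name}: accuracy={acc:.2f}  CE loss={loss:.3f}")
# Both cases print accuracy=1.00, but the loss differs: accuracy only sees the
# predicted class, while CrossEntropyLoss also reacts to how confident the
# predicted probabilities are.
```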

Note: changing the batch size only delays the growth in accuracy or brings forward the moment the loss starts evolving.

Any ideas about this behaviour?
