I want to fine-tune TrOCR on my custom dataset of receipts. Since the OCR targets receipts, we chose the "printed" fine-tuned model. Our dataset consists of 5,000 bounding boxes, each containing a single word. However, all our metrics (CER, precision) and the loss get worse with every epoch, and we can't figure out why the model degrades the longer we train.
The processor, model, and optimizer are set up as shown below:
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
import torch.optim as optim

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")
optimizer = optim.AdamW(model.parameters(), lr=5e-5)
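For context, this is roughly how we build the labels for each word crop (a simplified pure-Python sketch; `make_labels` and its argument names are illustrative, not our exact code — the real pipeline uses the processor's tokenizer). Pad positions are replaced with -100 so the cross-entropy loss ignores them:

```python
def make_labels(token_ids, max_len, pad_token_id):
    # Pad the word's token ids out to a fixed length, then mask the padding
    # with -100, the ignore index of PyTorch's cross-entropy loss.
    padded = token_ids + [pad_token_id] * (max_len - len(token_ids))
    return [tid if tid != pad_token_id else -100 for tid in padded]
```

The batches yielded by `train_dataloader` then contain `pixel_values` from the processor and these `labels`.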
Training method:
for epoch in range(self.epochs):
    self.model.train()
    train_loss = 0.0
    for batch in tqdm(self.train_dataloader):
        # Move every tensor in the batch (pixel_values, labels) to the device
        for k, v in batch.items():
            batch[k] = v.to(self.device)
        outputs = self.model(**batch)
        loss = outputs.loss
        loss.backward()
        self.optimizer.step()
        self.optimizer.zero_grad()
        train_loss += loss.item()
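For reference, the CER we report is the character-level edit distance divided by the reference length. A minimal pure-Python sketch of the metric (illustrative — our actual evaluation decodes predictions with the processor's batch_decode first):

```python
def edit_distance(a, b):
    # Levenshtein distance with a single rolling row of the DP table.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def cer(prediction, reference):
    # Character error rate: edits needed, normalized by reference length.
    return edit_distance(prediction, reference) / max(len(reference), 1)
```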
Does anyone know what we might be doing wrong?
When evaluated out of the box, the model performs well on our receipts, but we want to continue training on them to hopefully improve it further.