I've been building an image classification pipeline that runs object detection first and then classifies the detected images. I have 87 custom classes in my data (not ImageNet classes) and just over 7000 images altogether (around 60 images per class). I am happy with my object detection code and I think it works quite well. For classification, I have tried AlexNet, ResNet18, ResNet50 and ResNet101. With every model my training accuracy is high, but my testing accuracy is very low (around 10%). I've also tried regularisation and different learning rates, but I cannot reach the accuracy I require (>80%). I wonder if there is a bug in my code, although I haven't been able to find one.

Here is my training code. I have also preprocessed the images in the way that PyTorch pretrained models expect:

import torch
import torch.nn as nn
import torch.optim as optim
from typing import Callable
import numpy as np

EPOCHS=100

resnet = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50')
resnet.eval()

        
resnet.fc = nn.Linear(2048, 87)
res_loss = nn.CrossEntropyLoss() 
res_optimiser = optim.SGD(resnet.parameters(), lr=0.01, momentum=0.9,  weight_decay=1e-5)

def train_model(model, loss_fn, optimiser, modelsavepath):

    for j in range(EPOCHS):
        model.train()  # enable training behaviour for dropout and batch norm
        running_loss = 0.0
        correct = 0
        total = 0

        for inputs, labels, paths in training_generator:
            optimiser.zero_grad()
            outputs = model(inputs)
            loss = loss_fn(outputs, labels)
            loss.backward()
            optimiser.step()

            # Count per sample so the statistics are right for any batch size.
            _, predicted = torch.max(outputs, 1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
            running_loss += loss.item()

        train_loss = running_loss / len(training_generator)  # mean loss per batch
        train_acc = 100.0 * correct / total  # percentage of correct predictions

        print("Epoch:{}/{} AVG Training Loss:{:.3f} AVG Training Acc {:.2f}% ".format(j + 1, EPOCHS, train_loss, train_acc))

    torch.save(model, modelsavepath)

train_model(resnet, res_loss, res_optimiser, 'resnet.pth')

Here is the testing code used for a single image. It is part of a class:

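self.model.eval()
with torch.no_grad():  # gradients are not needed at inference time
    outputs = self.model(img[None, ...])  # models expect batches, so give it a singleton batch
scores, predictions = torch.max(outputs, 1)
predictions = predictions.numpy()[0]
# scores holds a single value here, so this argmax is always 0
possible_scores = np.argmax(scores.numpy())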

Is there a bug in my code, either testing or training, or is my model just overfitting? Additionally, is there a better image classification model that I could try?

Sire
  • How did you pre-process the images for training? If you normalize the training set but not the images you predict on, that would be the problem. – Natthaphon Hongcharoen Aug 24 '22 at 12:38
  • Yes, I did normalize both the training and test set. – Sire Aug 24 '22 at 13:38
  • If you get high training accuracy (~100%) and poor test accuracy, you're facing an overfitting problem. I recommend using a smaller network (especially a smaller dimension in the fc layer) and stronger data augmentation. – Hayoung Aug 25 '22 at 08:09

1 Answer

Your dataset is very small, so you're most likely overfitting. Try the following:

  1. Decrease the learning rate (try 0.001, 0.0001, 0.00001).
  2. Increase weight_decay (try 1e-4, 1e-3, 1e-2).
  3. If you don't already, use image augmentations (at least the default ones, such as random crop and flip).
  4. Watch the train/test loss curves when fine-tuning your model, and stop training as soon as you see test accuracy going down while train accuracy goes up.
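
Points 1, 2 and 4 can be sketched in plain Python, independent of any particular model. This is only an illustration under stated assumptions: `EarlyStopping`, `hyperparameter_grid` and the `patience` setting are hypothetical names that do not come from the question, and the accuracy values in the demo are made up. The idea is to run one training job per (learning rate, weight decay) pair and, within each job, stop once the held-out accuracy has not improved for a few epochs.

```python
from itertools import product

# Candidate values taken from points 1 and 2 above.
LEARNING_RATES = [1e-3, 1e-4, 1e-5]
WEIGHT_DECAYS = [1e-4, 1e-3, 1e-2]


class EarlyStopping:
    """Hypothetical helper: signals a stop once held-out accuracy
    has not improved for `patience` consecutive epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_acc = float("-inf")
        self.best_epoch = -1
        self.epochs_without_improvement = 0

    def step(self, epoch, val_acc):
        """Record one epoch's accuracy; return True when training should stop."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


def hyperparameter_grid():
    """Return every (lr, weight_decay) pair to try, one training run each."""
    return list(product(LEARNING_RATES, WEIGHT_DECAYS))


if __name__ == "__main__":
    # Made-up accuracies, used only to show the stopping behaviour.
    fake_val_acc = [0.20, 0.35, 0.41, 0.40, 0.39, 0.38]
    stopper = EarlyStopping(patience=3)
    for epoch, acc in enumerate(fake_val_acc):
        if stopper.step(epoch, acc):
            print("stopping at epoch", epoch, "- best epoch was", stopper.best_epoch)
            break
```

In a real run you would call `stopper.step(epoch, val_acc)` after evaluating on held-out data at the end of each epoch, and save the model whenever `stopper.best_epoch == epoch`, so that the checkpoint you keep is the one from before overfitting set in. Ideally the held-out data is a validation split separate from the final test set.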
MichaelSB