In an image classification task with around 15 classes, the classes are rather simple: they are distinguished mainly by color, not by very high-level detail.

The graphs of loss and accuracy over 100 epochs: [loss plot] [accuracy plot]

I am using the simple CNN model below, consisting of two convolutional layers followed by three linear layers:

Summary

SimNet1(
  (conv1): Sequential(
    (0): Conv2d(3, 4, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(4, 6, kernel_size=(5, 5), stride=(1, 1))
    (4): ReLU()
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (mlp1): Sequential(
    (0): LazyLinear(in_features=0, out_features=120, bias=True)
    (1): ReLU()
    (2): Linear(in_features=120, out_features=60, bias=True)
    (3): ReLU()
    (4): Linear(in_features=60, out_features=13, bias=True)
  )
)

I am training with the SGD optimizer (learning rate 0.02, momentum 0.9) and a StepLR scheduler with step_size=7 and gamma=0.1.
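
For reference, a minimal sketch of that setup (the constructor arguments are taken from the model summary above; the 64x64 input size in the dry run is an assumption for illustration, since LazyLinear should be materialized by a forward pass before the optimizer sees its parameters):

import torch
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

model = SimNet1(conv_out_1=4, conv_out_2=6, hid_dim_1=120,
                hid_dim_2=60, num_classes=13, kernel_size=5)
model(torch.randn(1, 3, 64, 64))  # dry run so LazyLinear initializes its weights

optimizer = optim.SGD(model.parameters(), lr=0.02, momentum=0.9)
scheduler = StepLR(optimizer, step_size=7, gamma=0.1)
# with scheduler.step() once per epoch, lr = 0.02 * 0.1**(epoch // 7),
# i.e. roughly 2e-9 by epoch 50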

The training loss stays around 2.53 for over 50 epochs (each epoch is 10 batches), and the training accuracy stays at about 0.153 with little fluctuation. The validation loss also stays around 2.53, with validation accuracy similar to the training accuracy.
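
For context, a loss stuck near 2.53 is close to the cross-entropy of a uniform guess over the 13 classes, which can be checked directly:

import math
print(math.log(13))  # ~2.565: cross-entropy of predicting 1/13 for every class
print(1 / 13)        # ~0.077: chance-level accuracy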

On the other hand, I have tried ResNet18 with the final layer modified to match the number of output classes. Its training accuracy converges to about 0.6, much higher than the simple model, with validation accuracy around 0.5. I switched to the simple model because ResNet seemed to overfit: its validation loss was significantly higher than its training loss.
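
The ResNet18 modification described above looks roughly like this, assuming torchvision's resnet18 (whether the weights argument is None or a pretrained enum depends on the setup):

import torch.nn as nn
import torchvision.models as models

resnet = models.resnet18(weights=None)  # or e.g. models.ResNet18_Weights.DEFAULT
resnet.fc = nn.Linear(resnet.fc.in_features, 13)  # replace head for 13 classes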

How can I mitigate this problem of the loss not improving with the simple model?

Possible attempts:

  • Increase model complexity: more hidden units, more convolution channels, and a slightly deeper architecture.

  • I am not sure whether the weight decay is too strong relative to the learning rate, or whether the learning rate is too low. Initially I used a learning rate of 0.0002 instead of 0.02, but the accuracy stayed at 0.04 and did not improve; after raising it, the loss still does not improve.

Actual code of the model:

import torch
import torch.nn as nn

class SimNet1(nn.Module):
    def __init__(self, conv_out_1, conv_out_2, hid_dim_1, hid_dim_2, num_classes, kernel_size):
        super().__init__()

        # === Start Conv Layers ===
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, conv_out_1, kernel_size),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(conv_out_1, conv_out_2, kernel_size),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )

        # === End Conv Layers ===


        # === Start MLP Layers ===
        self.mlp1 = nn.Sequential(
            nn.LazyLinear(hid_dim_1), # if using nn.Linear(), in_dim determined by final conv_out * (image dim after conv)^2
            nn.ReLU(),
            nn.Linear(hid_dim_1, hid_dim_2),
            nn.ReLU(),
            nn.Linear(hid_dim_2, num_classes) # final output dimension matches num of classes
        )

        # === End MLP Layers ===


    def forward(self, x):
        x = self.conv1(x)

        # Flatten tensor except batch
        x = torch.flatten(x, 1)

        x = self.mlp1(x)

        return x
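
A quick smoke test of the class (the 64x64 input size is an assumption for illustration; the first forward pass is what materializes LazyLinear's weights, which is why the summary prints in_features=0 until then):

model = SimNet1(conv_out_1=4, conv_out_2=6, hid_dim_1=120,
                hid_dim_2=60, num_classes=13, kernel_size=5)
x = torch.randn(2, 3, 64, 64)  # batch of 2 RGB images
print(model(x).shape)          # torch.Size([2, 13])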

AlgoManiac
Comment: I'm not sure that this is a programming question: it probably belongs in https://datascience.stackexchange.com/. – Brannon Mar 01 '23 at 15:41

0 Answers