I'm building a neural network in PyTorch that takes a 250x260 image with 3 channels as input and returns a pixel-by-pixel classification into 4 classes: the output has 4 channels of 250x260, one per class. This is the neural network:
CNN(
  (conv1): Conv2d(3, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (norm): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (linear_1): Linear(in_features=64480, out_features=150, bias=True)
  (linear_4): Linear(in_features=150, out_features=64480, bias=True)
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (t_conv1): ConvTranspose2d(16, 8, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
  (t_conv2): ConvTranspose2d(8, 4, kernel_size=(3, 3), stride=(2, 2), padding=(0, 1), output_padding=(1, 1))
)
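To sanity-check the layer sizes, here is a minimal sketch of the shape arithmetic in plain Python. It assumes a forward order of conv1, pool, conv2, pool, flatten, the two linear layers, a reshape back to 16 channels, then t_conv1 and t_conv2; the forward method is not shown above, so that order is only a guess. The helper names (`conv_out`, `pool_out`, `t_conv_out`, `trace_shapes`) are made up for illustration, and the formulas are the standard ones for dilation 1.

```python
def conv_out(size, kernel, stride, padding):
    # Output size of Conv2d along one dimension (dilation = 1).
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    # Output size of MaxPool2d along one dimension (padding = 0, ceil_mode=False).
    return (size - kernel) // stride + 1


def t_conv_out(size, kernel, stride, padding, output_padding):
    # Output size of ConvTranspose2d along one dimension (dilation = 1).
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_shapes(height=250, width=260):
    # ASSUMED layer order; the real forward() may differ.
    h, w = height, width
    h, w = conv_out(h, 3, 1, 1), conv_out(w, 3, 1, 1)  # conv1 keeps the size
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)        # pool halves it
    h, w = conv_out(h, 3, 1, 1), conv_out(w, 3, 1, 1)  # conv2 keeps the size
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)        # pool halves it again
    flat = 16 * h * w                                  # features seen by linear_1
    # t_conv1 uses padding 1 on both axes; t_conv2 uses padding (0, 1).
    out_h = t_conv_out(t_conv_out(h, 3, 2, 1, 1), 3, 2, 0, 1)
    out_w = t_conv_out(t_conv_out(w, 3, 2, 1, 1), 3, 2, 1, 1)
    return (h, w), flat, (out_h, out_w)


bottleneck, flat_features, output_size = trace_shapes()
print(bottleneck, flat_features, output_size)
```

Under that assumed order, the flattened size matches the in_features of linear_1 and the output comes back to 250x260, so the shapes themselves look consistent.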
However, I'm using the cross-entropy loss function, and it always returns a loss of 0.001 or 0.000 even though the predictions are completely wrong, e.g.:
The first image is the ground truth; there are 4 classes in total (yellow = class 3, green = class 2, light blue = class 1, purple = class 0). The second image is the network's prediction during the first epoch of training. I understand that the first epoch won't be very accurate, but this happens even after 50 epochs, and I don't understand why the loss function returns 0.000 when the prediction is clearly different from the ground truth. This is the training code:
for epoch in range(50):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        model.train()
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
        running_loss = 0.0
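To make explicit what the print statement in this loop reports, here is a minimal sketch in plain Python. It assumes the print and the reset run on every batch, as in the loop above, so the printed number is a single batch loss divided by 2000. The function name `printed_losses` and the batch loss values are hypothetical, chosen only for illustration; `math.log(4)` is the cross entropy of a uniform guess over 4 classes.

```python
import math


def printed_losses(batch_losses, divisor=2000):
    # Mimic the loop: accumulate, format running_loss / divisor, reset every batch.
    printed = []
    running_loss = 0.0
    for loss in batch_losses:
        running_loss += loss
        printed.append(f"{running_loss / divisor:.3f}")
        running_loss = 0.0
    return printed


# Hypothetical per-batch losses, not measured values.
uniform_loss = math.log(4)  # about 1.386
example = printed_losses([uniform_loss, 0.9, 0.5])
print(example)
```

With these made-up values the formatted output is 0.001 or 0.000 even though the underlying losses are far from zero, so the printed figure may not reflect the actual value of `loss.item()`.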
Do you know why the loss function reports such a low value when the predictions are clearly inaccurate?