I'm working on a convolutional neural network to classify an image dataset with binary labels (either 0 or 1). During training, every epoch ends with zero false negatives. Does that mean my network is just classifying everything as 1 and not even trying to match the 0s? If so, how can I combat this? The dataset is imbalanced, with more 0s than 1s: the 0:1 ratio is about 8000:5000 for the training set and 700:500 for validation.
-
"*What does it mean when my CNN has zero false negatives?*" I would assume it means you did a good job. Now you can go through each item in the dataset and find out if it was accurate or not. – TylerH Dec 21 '20 at 15:36
-
Could be. How many false positives are there? – phuzi Dec 21 '20 at 15:39
-
Yeah, that's the thing: there are a lot of false positives (thousands). Accuracy is about 40%. – thisisme Dec 21 '20 at 15:45
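As a rough sanity check on these numbers, here is a minimal sketch of what a classifier that predicts 1 for every image would score on the validation split quoted in the question (about 700 zeros and 500 ones). The all-ones predictor and the helper `confusion_counts` are hypothetical illustrations, not the asker's model.

```python
# Validation split quoted in the question: roughly 700 zeros and 500 ones.
y_true = [0] * 700 + [1] * 500

# Hypothetical degenerate classifier: predicts 1 for every image.
y_pred = [1] * len(y_true)


def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp


tn, fp, fn, tp = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

print("TN, FP, FN, TP:", tn, fp, fn, tp)  # 0 700 0 500
print("accuracy:", round(accuracy, 3))    # 0.417
```

Zero false negatives, many false positives, and an accuracy close to the share of 1s in the data is the pattern an always-1 predictor produces, which is consistent with the roughly 40% reported here.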
1 Answer
Having zero false negatives sounds pretty suspicious. What is your accuracy? What does the confusion matrix look like? In any case, I would recommend introducing class weights for the imbalanced training data.
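The class-weight suggestion can be sketched as follows, assuming the common "balanced" heuristic (weight = total / (number of classes × class count)) and the approximate 8000:5000 training counts from the question. The helper name `balanced_class_weights` is made up for illustration; other weighting schemes are equally valid.

```python
def balanced_class_weights(counts):
    """Weight each class inversely to its frequency.

    counts: dict mapping class label -> number of training samples.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Approximate training counts from the question (0:1 is about 8000:5000).
train_counts = {0: 8000, 1: 5000}
class_weight = balanced_class_weights(train_counts)

print(class_weight)  # {0: 0.8125, 1: 1.3}

# In Keras, a dict like this can be passed to training, e.g.:
#   model.fit(train_data, epochs=..., class_weight=class_weight)
# (`model` and `train_data` are placeholders, not defined here.)
```

With these weights, an error on the rarer class costs more in the loss than an error on the more common class.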

Stephan Boner
-
The accuracy is only about 40% and there are a lot of false positives. Thank you for the class weights suggestion, that's a good idea. – thisisme Dec 21 '20 at 15:46
-
Ok, I assume that something is wrong then. How big are the images? What activation function do you use for the classification? Which loss? What does your network architecture look like? I guess we can only help you if we know these things... – Stephan Boner Dec 21 '20 at 16:32
-
It seems so, yeah. The images are all resized to 320x320 (I'm generating them with flow_from_dataframe), and the batch size is 64. I'm using binary cross-entropy as the loss function and softmax as the activation function. The architecture is VGG with all its layers set to trainable=False; its output gets flattened, passed through 3 consecutive Dense(128) layers, and then the Dense(1) output layer. The added layers are trainable. – thisisme Dec 21 '20 at 16:37
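One detail in this setup is worth checking, assuming the softmax is applied to the single-unit Dense(1) output: softmax normalizes over the units of a layer, so with only one unit it returns 1.0 for any input. The sketch below uses plain Python rather than the asker's Keras model, and the logit values are arbitrary; it compares that behaviour with a sigmoid, the usual pairing for a one-unit output with binary cross-entropy. This may not be the only problem in the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic sigmoid for a single logit."""
    return 1.0 / (1.0 + math.exp(-z))


# Arbitrary example logits for a one-unit output layer.
logits = [-5.0, -1.0, 3.0]

# Softmax over a single unit: exp(z) / exp(z) is 1.0 whatever z is.
softmax_out = [softmax([z])[0] for z in logits]
# Sigmoid depends on the logit, so the output can fall on either side of 0.5.
sigmoid_out = [sigmoid(z) for z in logits]

preds_softmax = [int(p > 0.5) for p in softmax_out]
preds_sigmoid = [int(p > 0.5) for p in sigmoid_out]

print(softmax_out)    # [1.0, 1.0, 1.0]
print(preds_softmax)  # [1, 1, 1]
print(preds_sigmoid)  # [0, 0, 1]
```

If the output layer really is a one-unit softmax, every prediction is 1 no matter what the network learns, which would produce zero false negatives and an accuracy stuck near the share of 1s in the data.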
-
That doesn't sound too bad, but why don't you let the CNN layers be trainable? – Stephan Boner Dec 21 '20 at 16:42
-
I figured that if I'm using a pre-trained VGG model, then I only need to train the layers I add on top of it. Maybe this isn't conceptually correct; this is my first encounter with transfer learning. Do you think I should let all layers train, or maybe just some of the later VGG layers plus my own layers? That said, I have tried that and there were still zero false negatives. – thisisme Dec 21 '20 at 16:44
-
The general idea makes sense, but honestly I don't know whether it works here. Since your accuracy is low, making the VGG layers trainable might be worth trying, although it will probably take ages to train. How many images do you have? 13k? It also depends on your infrastructure, of course. – Stephan Boner Dec 21 '20 at 16:49
-
I accidentally gave a ratio instead of the actual numbers. There are in fact just over 30 000 images, of which about 15 000 are '1' and 22 000 are '0'. Maybe I'm just being impatient and not giving it enough time to train; that seems plausible. – thisisme Dec 21 '20 at 16:59
-
How many epochs did you train for? How do your training and validation accuracy develop? – Stephan Boner Dec 21 '20 at 17:09
-
The training runs very slowly, so I only ran about 15 epochs. I'm training on Google Colab and one epoch usually took 7 minutes (if I make the whole model trainable, it becomes about 17 minutes per epoch). The accuracies don't really change much; they stay very low at about 40% and usually move only a few percent up or down. – thisisme Dec 21 '20 at 17:12
-
15 epochs is not nearly enough. I could imagine that it just takes a while... Are your pictures hard to distinguish? And are the training and test accuracy both about 40%? – Stephan Boner Dec 22 '20 at 07:47
-
I ended up leaving it for 60 epochs, and accuracy remained at 40% with zero false negatives :/ The pictures are quite hard: they are x-ray scans, where 0 means a scan without problems and 1 means a scan with some abnormality. I think I will need to do proper preprocessing on the images and try adding class weights. – thisisme Dec 22 '20 at 12:17
-
Oh okay, yeah, that sounds quite tough. Are you sure about the architecture? Is it commonly used for this type of image? – Stephan Boner Dec 22 '20 at 13:10