6

[example image]

I have a dataset in which I need to classify a certain set of labelled images. Currently this is done by human experts, so the available dataset is of good quality. The images have some very similar features.

To give an example, assume that the amount of rotting of an apple or tomato is categorized as very low, low, medium, high and very high (5 classes), and that very similar images exist in the adjacent class pairs, viz. (very low & low, low & medium, medium & high, high & very high).

Is there a way to overcome this?

It seems challenging, since differentiating between adjacent classes is complex and confusing: a very similar image can exist in more than one class.

Niranjan A
  • Have you tried anything or are you just speculating? – BlackBear Feb 25 '19 at 07:45
  • Similar images are obviously harder to differentiate than image classes that have highly different features. I'm afraid there is no magical solution to your problem, a hard classification task will be hard to solve regardless of which features or classifier you choose. – T A Feb 25 '19 at 09:24
  • @BlackBear I tried using a ResNet50, freezing the base model architecture and only training the Dense layers that I added below (tried with BatchNorm and Dropout as well), using an SGD optimizer (tried different learning rates ranging from 0.1 to 0.001). The result was that the training accuracy and training loss improved well to 78%, but the validation accuracy just stopped increasing after 53%. – Niranjan A Feb 25 '19 at 09:42
  • @TA I understand that it is obviously very difficult, but I only wanted to understand whether, at the present moment, it is just difficult or virtually impossible. If it's just difficult, what approach would you suggest? Would segmenting the rotted pixels and then passing them to a classifier help? – Niranjan A Feb 25 '19 at 09:44
  • 2
    How many images do you have? The more the better. Maybe you can reduce the number of classes. Also try to use ordinal regression, e.g. https://www.kaggle.com/c/diabetic-retinopathy-detection/discussion/13115 – BlackBear Feb 25 '19 at 09:49
  • @BlackBear I have close to 500 images per class. I have a larger dataset that is yet to be labelled, which would amount to almost 20,000 images across classes. By the way, your suggestion sounds really appealing and intuitive. Any useful links to help me implement ordinal classification for image datasets? All implementations that I could find were for tabular data from CSV files. – Niranjan A Feb 25 '19 at 11:02
  • 2
    You should see a significant improvement with 40x the data :) as for the implementation, see e.g. https://github.com/JHart96/keras_ordinal_categorical_crossentropy – BlackBear Feb 25 '19 at 11:34
  • @BlackBear Thanks for sharing. I didn't even know there was such a thing as ordinal classification. If you know of any good blogs that explain the intuition behind it, let me know. I am just going off Google and I found a couple: https://www.cs.waikato.ac.nz/~eibe/pubs/ordinal_tech_report.pdf http://www.cs.uu.nl/docs/vakken/b3dar/ordinal-dict.pdf – Moondra Feb 25 '19 at 19:09
  • @BlackBear you just opened a window in a place where I was stuck. Thanks a lot, Sir. – Niranjan A Feb 26 '19 at 08:28
  • @NiranjanA glad to hear :) good luck! – BlackBear Feb 26 '19 at 08:56
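The ordinal-regression idea from the comments can be sketched as follows: instead of treating the 5 rot levels as independent one-hot classes, each ordered class is encoded as cumulative binary targets ("is the rot level greater than threshold t?"), which is the trick behind the linked Kaggle discussion. This is a minimal NumPy sketch; the helper names are illustrative, not from any library:

```python
import numpy as np

def ordinal_encode(labels, num_classes=5):
    """Encode 0-based ordinal class indices as cumulative binary targets.

    With 5 classes there are 4 thresholds, and class k becomes a vector
    whose first k entries are 1, e.g. class 3 -> [1, 1, 1, 0].
    """
    labels = np.asarray(labels)
    thresholds = np.arange(num_classes - 1)          # [0, 1, 2, 3]
    # Broadcast: each label is compared against every threshold
    return (labels[:, None] > thresholds).astype(np.float32)

def ordinal_decode(probs, threshold=0.5):
    """Predicted class = number of thresholds the model believes are passed."""
    return (np.asarray(probs) > threshold).sum(axis=1)
```

A network trained this way would end in `num_classes - 1` sigmoid units with a binary cross-entropy loss, so adjacent-class mistakes are penalised less than distant ones.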

4 Answers

0

I recommend predicting a single "rot score" (e.g. 0.0–1.0) and then binning that value into classes based on the score.

For example:
  • 0.0–0.2: very low rot
  • 0.2–0.4: low rot
  • 0.4–0.6: medium rot
  • 0.6–0.8: high rot
  • 0.8–1.0: very high rot
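The binning step above could be implemented along these lines (a minimal sketch; the label names follow the answer, and the bin edges are the ones listed):

```python
def bin_rot_score(score):
    """Map a predicted rot score in [0.0, 1.0] to one of five ordinal labels."""
    labels = ["very low", "low", "medium", "high", "very high"]
    # Each bin is 0.2 wide; min() keeps a score of exactly 1.0 in the last bin
    return labels[min(int(score * 5), 4)]
```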

Derza Arsad
0

The convolutional layers learn low-level features, while the fully connected layers capture more abstract ones. Thus you may need to add deeper layers for better accuracy in this case.

0

Visual similarity is not necessarily a problem. Generally speaking, if humans fail to classify an image, then a CNN will probably fail too. As a rough indication of difficulty, you might ask a friend to classify a few difficult samples; if they (or you yourself) struggle, it might be virtually impossible. I don't think the example you show is extremely difficult, but I could see that humans also have difficulty separating the classes well, and a look at the data and its labelling might reveal this.

J.T.
0

You can do this with a convolutional neural network.
Use some Conv2D blocks to extract features from the images, then fully connected Dense layers at the end.
As you have 5 classes to predict, a softmax activation at the final layer is appropriate.
Normalise the pixel values by dividing by 255.0 to train faster.
And the labels should be integer-encoded.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model = Sequential()
# Feature extraction: two convolutional blocks with max pooling
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Classifier head with dropout for regularisation
model.add(Flatten())
model.add(Dropout(0.15))
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.15))
model.add(Dense(50, activation='relu'))
model.add(Dense(5, activation='softmax'))  # 5 output classes

Then compile the model with the sparse_categorical_crossentropy loss function:

from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Try experimenting with the learning rate or the model hyperparameters to achieve good results.
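The preprocessing mentioned above (scaling pixels by 255.0 and integer-encoding the labels) might look like this; `class_names` and the input format are illustrative assumptions, not part of the answer's code:

```python
import numpy as np

# Ordered class names; the order defines the integer encoding (0..4)
class_names = ["very low", "low", "medium", "high", "very high"]

def preprocess(images, labels):
    """Scale uint8 pixel values to [0, 1] and integer-encode string labels."""
    X = np.asarray(images, dtype=np.float32) / 255.0
    y = np.array([class_names.index(label) for label in labels], dtype=np.int64)
    return X, y
```

The integer labels produced here are what `sparse_categorical_crossentropy` expects, so no one-hot encoding is needed.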

MD Mushfirat Mohaimin