Apologies in advance if this is in the wrong place or the formatting is incorrect.

I'm having an issue that I'm struggling to find an answer to, possibly because I worded my searches poorly. I have a model created and working correctly, achieving 91.5% accuracy across 6 classes. To summarize my issue:

The goal is to classify waste images: the model has to predict what kind of waste it sees. The 6 classes are clear and coloured plastic bottles, clear and coloured plastic bags, cans and glass bottles. I expected to get the model's confidence for each of the 6 classes, e.g. 67% sure it's a coloured bottle, 21% sure it's a can, and so on.

The actual result is an array of 6 floating point numbers in exponential notation, which is not ideal and doesn't indicate which class each number belongs to! I'm not getting any errors. Is there an issue with the way I've written the classification code that prevents more readable results, or am I missing something?

I'm using Google Colab as my IDE and my model is DenseNet-201.

Thanks in advance, Jack

Here is the code I'm using to classify my real-world collected data with my trained model. Below it is the code that assigns the labels to the array of wastes. My problem is that I cannot trust these labels to be in the same order as the floating point numbers I'm receiving! Note also that the images are being looped in from a folder on Google Drive. I have tried individual images but get the same results.

Code for classifying test images

# Morning Test
import cv2
import numpy as np
from keras.preprocessing import image

width = 100
height = 100

new_dimensions = (width, height)

counter=0


print("Morning Test - Experiment 1 - Clear Bottle \n")

# Morning Test
# Cycle through images
for x in range (0,10):
  exp1_morning_waste1 = cv2.imread('/content/gdrive/My Drive/Rivers V2/Test Set/New Images/Exp 1/Morn/' + 'MorningBottleClExp1_' + str(x+1) +'.jpg')

  # Check for existence
  if exp1_morning_waste1 is not None:
    
    # Count the classifications add one
    counter+=1

    # Resize
    exp1_morning_waste = cv2.resize(exp1_morning_waste1, new_dimensions)

    # Add image to array
    exp1_morning_waste = image.img_to_array(exp1_morning_waste)

    # Axis, Dimens
    exp1_morning_waste = np.expand_dims(exp1_morning_waste, axis=0)
    exp1_morning_waste= exp1_morning_waste/255

    # Predict image
    prediction_prob = model.predict(exp1_morning_waste)

    # Print Predictions
    print(f'Probability that image is a: {prediction_prob} ')

    # Image Number
    print("Waste Item No." + str(x+1) +"\n")
    

  # No Directory or image present
  else:
    print("File not Contacted")
    break

Output>> Morning Test - Experiment 1 - Clear Bottle

Probability that image is a: [[9.9152815e-01 1.2046337e-03 1.4043533e-03 5.7380428e-03 6.7023984e-06 1.1799879e-04]]
Waste Item No.1

and so on.....

Original Dataset labelling for training the model

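# Imports this snippet relies on (not shown in the original post)
import os
import random
import cv2
from keras.preprocessing.image import img_to_array

# Create dataset and label arrays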
wastedata=[]
labels=[]

# Set Random Number generator
random.seed(42)

# Access waste images directory
wasteDirectory = sorted(list(os.listdir("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/")))

# Shuffle the directory
random.shuffle(wasteDirectory)

# Print directory class names
print(wasteDirectory)

# Resize and sort images in directory in the case they haven't already
for img in wasteDirectory:
    pathDir=sorted(list(os.listdir("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/"+img)))
    for i in pathDir:
        imagewaste = cv2.imread("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/"+img+'/'+i)
        imagewaste = cv2.resize(imagewaste, (100,100))
        imagewaste = img_to_array(imagewaste)

        # Append the image to the dataset array
        wastedata.append(imagewaste)

        # Append the class (folder) name to the labels array
        labels.append(img)

Output>> ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

jmsv6

1 Answer

The model's predictions, i.e. these floating point numbers, are the probabilities for the respective classes (e.g. a value of 6.734e-1 = 6.734 * 10 ** (-1) indicates a probability of 67.34%). Your prediction is then the element of your array of classes at the index of the maximum value in your array of probabilities. In other words, you predict whichever class the model assigns the highest probability. Example:

classes = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']
probabilities = [9.9152815e-01, 1.2046337e-03, 1.4043533e-03, 5.7380428e-03, 6.7023984e-06, 1.1799879e-04]
max_prob = max(probabilities)
pred = classes[probabilities.index(max_prob)]
print(f'Model predicts a {max_prob*100:.2f}% probability of the item on the image being "{pred}".')

outputs

Model predicts a 99.15% probability of the item on the image being "Clear Plastic Bottle".
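
To get a readable value for every class rather than only the top one, the same idea can be extended as in the rough sketch below. It assumes that `classes` is listed in the same order as the model's output units, which is not confirmed by the question and should be checked against the training code. The helper name `describe` is hypothetical, and the array stands in for what `model.predict` returned for Waste Item No.1.

```python
import numpy as np

# Class names in the order printed by the question's labelling code.
# Assumption: this matches the order of the model's outputs (verify this).
classes = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags',
           'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

# Shape (1, 6), like the array printed for Waste Item No.1.
prediction_prob = np.array([[9.9152815e-01, 1.2046337e-03, 1.4043533e-03,
                             5.7380428e-03, 6.7023984e-06, 1.1799879e-04]])

def describe(prediction, class_names):
    """Return (class name, percentage) pairs, most likely class first."""
    probs = prediction[0]                # drop the batch dimension
    order = np.argsort(probs)[::-1]      # indices sorted by descending probability
    return [(class_names[i], float(probs[i]) * 100) for i in order]

report = describe(prediction_prob, classes)
for name, pct in report:
    print(f'{name}: {pct:.4f}%')
```

Note that the exponent matters when reading the raw numbers: 1.2046337e-03 is about 0.12%, not 12%, and 6.7023984e-06 is about 0.00067%. If only the raw array is wanted in plain decimals, `np.set_printoptions(suppress=True)` asks NumPy to avoid scientific notation when printing.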
Michael Hodel
  • Thanks for the swift reply! So then would it be safe to assume that for Waste Item No.1 for example that its probabilities are 99% Clear Plastic Bottle, 12% Clear Glass Bottle, 14% Clear Plastic Bag, 57% Coloured Plastic Bag, 67% Cans, and ~12% a Coloured Plastic Bottle? I wouldn't be too worried if I couldn't get it more neat with class names and percents, but my worry is if the class names are mixed around from the order they are in the print from the assigning labels or how I should know what each floating point probability number corresponds to which class! Thanks! – jmsv6 Apr 19 '22 at 00:07
  • Yes. If the order corresponds to the order of the one-hot-encoded classes in the data used during training. – Michael Hodel Apr 19 '22 at 00:11
  • Thanks again! I'm starting to get it! I'm undertaking research, which is why I want to understand everything and document what I've done concisely! The code I've used is a mix of code I used previously and some fine-tuning added so it works with my own dataset. My problem is knowing the order in which the classes were encoded when the model was trained. Is there a way of establishing that order, just for reassurance's sake? Thanks! – jmsv6 Apr 19 '22 at 00:24
  • Also sometimes it throws out decimal numbers for the classes like this Probability that image is a: [[0.5779088 0.00644592 0.39918187 0.00541931 0.00093419 0.01010992]] Waste Item No.7 . – jmsv6 Apr 19 '22 at 01:51
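
On establishing the class order: the question does not show how the string labels were turned into one-hot vectors, so the following is only a sketch under an assumption. If the labels were encoded with something that sorts the distinct class names (scikit-learn's `LabelBinarizer` and `LabelEncoder` do this and expose the result as `classes_` after fitting), then the output order is alphabetical and not the shuffled order printed by the labelling code. The snippet imitates that behaviour with plain NumPy; the `labels` list here is a hypothetical stand-in with three entries per class.

```python
import numpy as np

# Order printed by the question's labelling code (after random.shuffle).
shuffled = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags',
            'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

# Hypothetical stand-in for the real labels list: three entries per class.
labels = [name for name in shuffled for _ in range(3)]

# np.unique returns the distinct labels in sorted order, plus the integer
# index of each label within that sorted array.
class_order, encoded = np.unique(labels, return_inverse=True)

# One-hot encode: column k corresponds to class_order[k].
one_hot = np.eye(len(class_order))[encoded]

print(class_order.tolist())
print(one_hot[0])  # encoding of the first label, 'Clear Plastic Bottle'
```

Whatever encoder was actually used, printing its stored class list (or saving it next to the model) at training time is the reliable way to know which class each output position refers to.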