2

I am trying to load an image dataset into TensorFlow, but I can't get it to load properly. I have a folder named PetImages on my C drive that contains two folders, cat and dog. Each folder holds more than 12450 images, so there are more than 24500 images in total. I load them with the following code:

import numpy as np
import matplotlib.pyplot as plt
import os
import cv2

DATADIR = r"C:\Datasets\PetImages"  # raw string so backslashes are not treated as escapes
CATEGORIES = ["Dog", "Cat"]

for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        plt.imshow(img_array, cmap="gray")
        plt.show()
        break  # stop after the first image
    break  # stop after the first category

The result looks fine: it shows the first image in the folder. Then I resize the image to the desired size with the following code:

IMG_SIZE=50
new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
plt.imshow(new_array, cmap = "gray")
plt.show()

This part is also fine. Next I want to shuffle the images so that the two classes are mixed before I train and check the accuracy, but the problem is that only 12450 images are loaded after this code:

training_data = []
def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
    for img in os.listdir(path):
        try:
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            training_data.append([new_array, class_num])
        except Exception as e:
            pass
create_training_data()
print(len(training_data))

Then, when I use random to shuffle, I do not get images from both folders; it only shows the labels of one folder.

import random
random.shuffle(training_data)
for sample in training_data[:10]:
    print(sample[1])

But my result is 1 1 1 1 1 instead of a random mix such as 0 1 0 1 0 0 0 1 1, where the next value cannot be predicted.

Your help will be valuable to me. Thanks in advance

3 Answers

3

Looks to me like an indentation error. Your second for loop lies outside of your first for loop, which causes the first loop to terminate completely and set class_num to 1 before the second loop is ever entered. You probably want to nest them. Try:

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path,img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception as e:
                pass       
create_training_data()
print(len(training_data))
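
To see why the nesting matters without touching the real image folders, here is a minimal sketch. The `FAKE_FOLDERS` dictionary and both function names are hypothetical stand-ins (they are not part of the original code); the dictionary replaces `os.listdir` so the example runs anywhere.

```python
# Hypothetical stand-in for the two image folders: folder name -> file names.
FAKE_FOLDERS = {
    "Dog": ["d1.jpg", "d2.jpg", "d3.jpg"],
    "Cat": ["c1.jpg", "c2.jpg", "c3.jpg"],
}
CATEGORIES = ["Dog", "Cat"]


def load_unnested():
    """Inner loop placed after the outer loop, as in the question."""
    data = []
    for category in CATEGORIES:
        files = FAKE_FOLDERS[category]
        class_num = CATEGORIES.index(category)
    # By now the outer loop has finished, so files and class_num
    # hold the values of the last category only ("Cat", label 1).
    for name in files:
        data.append([name, class_num])
    return data


def load_nested():
    """Inner loop nested inside the outer loop, so every category is read."""
    data = []
    for category in CATEGORIES:
        files = FAKE_FOLDERS[category]
        class_num = CATEGORIES.index(category)
        for name in files:
            data.append([name, class_num])
    return data


unnested_labels = [label for _, label in load_unnested()]
nested_labels = [label for _, label in load_nested()]
print(unnested_labels)  # [1, 1, 1]
print(nested_labels)    # [0, 0, 0, 1, 1, 1]
```

The unnested version yields half as many samples, all with the same label, which matches the symptoms described in the question.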
dmurphy1
0

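You can try to shuffle a list of indices into the training data. Note that random.shuffle shuffles the list in place and returns None, so do not assign its result:

import random
index = list(range(len(training_data)))
random.shuffle(index)  # in place; returns None
shuffTrainingData = [training_data[val] for val in index]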
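
As a rough, self-contained sketch of the index-shuffling idea: the `training_data` list below is a hypothetical stand-in for the real `[image, label]` pairs, and the fixed seed is only an assumption made so that repeated runs give the same order.

```python
import random

# Hypothetical stand-in for training_data: [features, label] pairs.
training_data = [[f"img{i}", i % 2] for i in range(10)]

rng = random.Random(0)  # fixed seed only so the run is repeatable
indices = list(range(len(training_data)))
rng.shuffle(indices)  # shuffles in place and returns None

# Reorder the samples by the shuffled indices, then split features and labels.
shuffled = [training_data[i] for i in indices]
features = [item[0] for item in shuffled]
labels = [item[1] for item in shuffled]
print(labels)
```

Shuffling indices rather than the data itself leaves `training_data` untouched, which can be handy if the original order is needed later. It only helps once both classes have actually been loaded.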

Hope it helps

TavoGLC
0

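Your code only loaded the images from one folder into the training data, hence the training length of 12450. Since CATEGORIES is ["Dog", "Cat"], the first loop leaves path and class_num set for the last category, Cat (label 1), so you are only shuffling images from that one folder, which gives you all 1s. Your training length should be approximately 25000. Fix your indentation and you should be okay.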
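
A quick way to confirm this kind of problem before shuffling is to count the labels. The sketch below assumes `training_data` holds `[image, label]` pairs as in the question; `label_counts` and the small `sample` list are hypothetical names used for illustration.

```python
from collections import Counter


def label_counts(training_data):
    """Count how many samples carry each class label."""
    return Counter(label for _, label in training_data)


# Hypothetical sample that mimics the faulty load: every label is 1.
sample = [["a", 1], ["b", 1], ["c", 1]]
counts = label_counts(sample)
print(counts)  # Counter({1: 3})

# Report any expected class that has no samples at all.
missing = [c for c in (0, 1) if counts[c] == 0]
if missing:
    print("No samples loaded for labels:", missing)
```

If one label is missing entirely, the loading loop is the place to look, not the shuffle.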

rugare