
I am making an image-classification CNN and I am trying to create an h5py file with the preprocessed images, but I get a ValueError which says "zero-size array to reduction operation maximum which has no identity". Can someone please help me? I don't have loads of experience with Python and coding, so I might not fully understand what people explain. This is the code:

#declare image properties
IMG_WIDTH = 64
IMG_HEIGHT = 64
CHANNELS=3
preprocess_images = ()
#declare some training hyperparameters
BATCH_SIZE = 50
EPOCHS = 100

import os
import sys
import random
import json

import cv2
import numpy as np
import matplotlib.pyplot as plt
import h5py
from glob import glob
from PIL import Image
from sklearn.model_selection import train_test_split
import sklearn.datasets

import tensorflow as tf
import keras
from keras.utils import to_categorical
from keras.models import Sequential, Input, Model
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import LeakyReLU
from keras.regularizers import l2

#declare the directories that hold the training and test data
train_dir='/content/gdrive/MyDrive/Images/recycling'
test_dir ='/content/gdrive/MyDrive/Images/non-recyclable'
to_h5py(train_dir)
to_h5py(test_dir)

The issue is with the function 'to_h5py', but I don't know how to solve it.

def to_h5py(pth):
  #get list of folders and classes
  (folders, classes) = get_folders_and_classes(pth)

  #set the file name that the dataset will be saved under
  input_fname = 'processed_datasets/' + get_end_slash(pth)

  #checks if this particular file already exists and asks the user if it should be overwritten
  if os.path.isfile(input_fname):
    inp = input('overwrite file ' + input_fname + '?, y/n')
    if inp.lower() == "y":
      print("file will be overwritten")
      os.remove(input_fname)
    elif inp.lower() == "n":
      input_fname = input("enter a new filename: ") + '.h5'
      print(input_fname)
    else:
      print("incorrect input, preprocessing failed")
      return

  #create h5py file to store dataset
  hf = h5py.File(input_fname)

  #get list of all images and number of images
  all_images = glob(pth + "**/*.jpg", recursive=True)
  n_images = len(all_images)

  #create dataset X and label list
  X = hf.create_dataset(
      name="X",
      shape=(n_images, IMG_WIDTH, IMG_HEIGHT, CHANNELS),
      maxshape=(None, IMG_WIDTH, IMG_HEIGHT, None),
      compression="gzip",
      compression_opts=9)
  label_lis = []

  #set an index to iterate through
  x_ind = 0

  #go through all the folders
  for i, folder in enumerate(folders):
    images = glob(folder + "*.jpg")
    total_images = len(images)
    print(classes[i], total_images)

    #process each image in each folder and add the class and the processed image to the image array list
    for i, image_pth in enumerate(images):
      img = process_single_img(image_pth, IMG_WIDTH, IMG_HEIGHT)
      X[x_ind] = img
      label_lis.append(i)
      print("{}/{} fname={}".format(i, total_images, get_pic_name(image_pth)))
      x_ind += 1

  #store the labels under the y set
  hf.create_dataset(
      name='y',
      compression="gzip",
      compression_opts=9,
      data=label_lis)

  #convert the labels to one-hot values (i.e. 2 -> [0 0 1 0 0] if there were 5 possible values)
  y_one_hot = to_categorical(np.array(label_lis))
  hf.create_dataset(
      name='y_one_hot',
      compression="gzip",
      compression_opts=9,
      data=y_one_hot)

  #close the opened file
  hf.close()

UPDATE: this is the full traceback; thank you for any help.

  • This utility is supposed to create an HDF5 file from the images. Do you get one? Are there any datasets (and data) in it? There should be 2 datasets: 'y' and 'y_one_hot'. I am having trouble following the code starting with `def to_h5py(pth):`. Is it formatted/indented correctly? Is all of that code part of function `to_h5py()`? Also, where is the function `get_folders_and_classes()` defined? Please clean up or clarify. – kcw78 Mar 03 '21 at 22:51
  • the formatting is fine: – Avinash Ehathasan Mar 10 '21 at 18:54
  • `def get_folders_and_classes(pth): folders=glob(pth+"*") classes=[get_end_slash(f) for f in folders] return(folders, classes)` – Avinash Ehathasan Mar 10 '21 at 18:54
  • for some reason, from a certain point the editor thinks that there aren't any images, but up to that point the code shows that there are images in the directory paths I have provided. – Avinash Ehathasan Mar 10 '21 at 18:57
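A quick way to answer kcw78's first question (does the file exist, and what is in it?) is to open it read-only and list its keys. This is only a sketch: it assumes `get_end_slash()` returns the last path component, so the training file would be 'processed_datasets/recycling'.

    import h5py

    # assumption: get_end_slash() returns the last path component of train_dir,
    # so the output file created by to_h5py() would be 'processed_datasets/recycling'
    with h5py.File('processed_datasets/recycling', 'r') as hf:
        print(list(hf.keys()))            # should show ['X', 'y', 'y_one_hot']
        for name in hf:
            print(name, hf[name].shape)   # a zero in any shape points at empty data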

1 Answer


I can reproduce the error message with:

In [213]: np.maximum.reduce([])
Traceback (most recent call last):
  File "<ipython-input-213-73052b120c74>", line 1, in <module>
    np.maximum.reduce([])
ValueError: zero-size array to reduction operation maximum which has no identity

or with np.max([]).

But without the full traceback we can't help you identify where the error occurs. Who is taking the max and why is the array in question size 0?
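One plausible candidate in the posted code (only a guess, since the traceback isn't shown) is the to_categorical(np.array(label_lis)) call at the end of to_h5py: when num_classes isn't given, to_categorical determines the number of classes by taking the max of the label array, so an empty label_lis produces exactly this error:

    import numpy as np
    from keras.utils import to_categorical

    label_lis = []   # what label_lis ends up as if glob() finds no images
    # with num_classes unspecified, to_categorical calls np.max() on the labels,
    # and a zero-size array raises the same ValueError as in the question
    to_categorical(np.array(label_lis))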

Unfortunately we see a lot of questions like this: people with little programming experience trying to use fairly advanced code without much understanding of what is going on, or of how to do basic debugging :(
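In this case the basic debugging is mostly a matter of printing how many files the glob patterns actually match. Here is a minimal check, assuming the Google Drive paths from the question; the missing path separator in pth+"**/*.jpg" is only a guess at the cause:

    import os
    from glob import glob

    train_dir = '/content/gdrive/MyDrive/Images/recycling'   # path from the question

    # pth + "**/*.jpg" only behaves as a recursive pattern if pth ends with a
    # separator; os.path.join sidesteps that
    all_images = glob(os.path.join(train_dir, '**', '*.jpg'), recursive=True)
    print(len(all_images), 'jpg files found under', train_dir)

    # the per-folder loop has the same issue: folder + "*.jpg" needs a separator too
    for folder in glob(os.path.join(train_dir, '*' + os.sep)):
        print(folder, len(glob(os.path.join(folder, '*.jpg'))))

If these counts come back as 0, the problem is the glob patterns or the directory layout, not the HDF5 or NumPy code.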

hpaulj
  • thanks, but is there anything I can do specifically? The reason for the error (I think) is that Python thinks there are no images in train_dir, but there are, and the file path works. I'm not sure about this, but I have looked at other questions and that is what I have concluded. – Avinash Ehathasan Mar 03 '21 at 20:48
  • If that's the case, the problem isn't with the calculation, but with the loading, and possibly your own file and directory structure. That's not something we can help with from afar! – hpaulj Mar 03 '21 at 21:47