
I am following this tutorial on creating a custom model using TensorFlow Lite Model Maker on Colab.

import pathlib
# imports assumed from the Model Maker tutorial being followed
from tflite_model_maker import image_classifier
from tflite_model_maker import ImageClassifierDataLoader

path = pathlib.Path('/content/employee_pics')
count = len(list(path.glob('*/*.jpg')))   # count all .jpg files across the label folders
count

data = ImageClassifierDataLoader.from_folder(path)
train_data, test_data = data.split(0.5)

I have an issue with step 2:

model = image_classifier.create(train_data)

I get an error: ValueError: Expect x to be a non-empty array or dataset.


Am I doing something wrong? The data set provided in the example works fine though. Why?

Mena

3 Answers


This error is caused by the size of the training data being smaller than batch_size, which is not allowed.

The default batch_size is 32, which means the number of training images should be no less than 32. There's no need to count the number of images per label; just make sure that the total number of training images is at least 32.

You need to choose one of the following solutions:

  • Set batch_size smaller than the size of the training data, e.g. (see also the sketch after this list):
image_classifier.create(train_data, batch_size=4)
  • Increase the size of the training data by adding more images.
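
As a rough sketch of the first option, assuming the same tflite-model-maker API as in the question and that the loaded train_data exposes its size via len() (an assumption, not something from the answer):

# Sketch: pick a batch_size that never exceeds the number of training images.
from tflite_model_maker import image_classifier

train_size = len(train_data)               # number of images in the training split (assumed to work via len())
batch_size = min(32, max(1, train_size))   # keep the default 32 only if enough images exist
model = image_classifier.create(train_data, batch_size=batch_size)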
Yuqi Li

I just did some manual tests. I don't know exactly why, but for this binary classifier, when I increased the amount of data so that at least 16 images per label were used for training, it started working.

In your case, because you split train/test by a factor of 0.5, you need 32 images per label. Could you try that and see if it solves your issue?
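
A quick way to check this, reusing the path from the question (a sketch, assuming one sub-folder per label containing .jpg files):

import pathlib

path = pathlib.Path('/content/employee_pics')
for label_dir in sorted(p for p in path.iterdir() if p.is_dir()):
    n_images = len(list(label_dir.glob('*.jpg')))
    print(label_dir.name, n_images)   # with data.split(0.5), aim for >= 32 per label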

Steven

Had the same error:

File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1110, in fit
    raise ValueError('Expect x to be a non-empty array or dataset.')
ValueError: Expect x to be a non-empty array or dataset.

First I tried reducing the batch size. If the batch size is bigger than the training dataset, the input dataset cannot form a batch and hence remains empty. But that was not my case.
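
For illustration only (plain tf.data, not the exact pipeline below): a batch size larger than the dataset, combined with drop_remainder=True, leaves zero complete batches, which is exactly what fit then rejects.

import tensorflow as tf

ds = tf.data.Dataset.range(10)                 # only 10 examples
batched = ds.batch(32, drop_remainder=True)    # no complete batch of 32 fits
print(sum(1 for _ in batched))                 # 0 -> fit() raises "Expect x to be a non-empty array or dataset."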

Then I tried to see where my dataset became empty. My first epoch ran fine, but the next one did not. It seems my dataset got transformed in the batching process.

classes = len(y.unique())
model = Sequential()
model.add(Dense(10, activation='relu',
                activity_regularizer=tf.keras.regularizers.l1(0.00001)))
model.add(Dense(classes, activation='softmax', name='y_pred'))
opt = Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)

BATCH_SIZE = 12
train_dataset, validation_dataset = set_batch_size(BATCH_SIZE, train_dataset, validation_dataset)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(_train_dataset, epochs=10, validation_data=_validation_dataset, verbose=2, callbacks=callbacks)

Solution for this case: avoid the redundant overwriting of the train and validation datasets when splitting them into batches, by assigning the batched datasets to different names.

Before:

train_dataset, validation_dataset = set_batch_size(BATCH_SIZE, train_dataset, validation_dataset)

After:

_train_dataset, _validation_dataset = set_batch_size(BATCH_SIZE, train_dataset, validation_dataset)
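
A hedged illustration of why reusing the same variable names can bite (plain tf.data here; set_batch_size is the answerer's own helper, so this only approximates it): if the assignment is re-run, the already-batched dataset gets batched again and its elements no longer look like single examples.

import tensorflow as tf

train_dataset = tf.data.Dataset.range(24)
train_dataset = train_dataset.batch(12)   # first run: elements are batches of 12
train_dataset = train_dataset.batch(12)   # re-running overwrites it with batches of batches
# Writing the batched result to a new name (_train_dataset) keeps the original intact.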




The full corrected code:

classes = len(y.unique())
model = Sequential()
model.add(Dense(10, activation='relu', activity_regularizer=tf.keras.regularizers.l1(0.00001)))
model.add(Dense(classes, activation='softmax', name='y_pred'))

opt = Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)

BATCH_SIZE = 12

_train_dataset, _validation_dataset = set_batch_size(BATCH_SIZE, train_dataset, validation_dataset)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(_train_dataset, epochs=10, validation_data=_validation_dataset, verbose=2, callbacks=callbacks)

Useful link: https://code.ihub.org.cn/projects/124/repository/commit_diff?changeset=1fb8f4988d69237879aac4d9e3f268f837dc0221