
I'm trying to train a model and validate it using Stratified K-fold cross-validation. I have stored the training and testing images together in a new folder, and stored the ground truths of both the training and testing sets in a CSV file from which the labels are read.

I'm using binary_crossentropy as the loss function since I'm working on binary classification.
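
For context, the compile call isn't shown below; a minimal sketch of what it implies would be (the optimizer and metrics here are assumptions):

# Hypothetical compile step, not shown in the original post: a single
# sigmoid output trained with binary_crossentropy. Optimizer and metrics
# are assumed here.
parallel_model.compile(optimizer='adam',
                       loss='binary_crossentropy',
                       metrics=['accuracy'])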

The CSV file contains two columns: Image (the name of the image file) and ID (the label of the corresponding image).
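
For illustration, the CSV is laid out like this (the filenames and labels are hypothetical):

Image,ID
img_001.png,0
img_002.png,1
img_003.png,0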

Here is the code:

EPOCHS = 1
N_SPLIT = 3

image_dir = 'path to the folder that contains all the images'

image_label = pd.read_csv('groundtruths of the images.csv')
df = image_label.copy()
    
main_pred = [] # a list to store the scores of each fold
error = []     # a list to store per-fold errors
data_kfold = pd.DataFrame()

train_y = df.ID    # the labels of the images
train_x = df.Image # the names of the image files

train_datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rotation_range=90) # data augmentation
validation_datagen = ImageDataGenerator()
kfold = StratifiedKFold(n_splits=N_SPLIT, shuffle=True, random_state=42) # making the folds

j = 0 # a variable to count the fold number

for train_idx, val_idx in kfold.split(train_x, train_y):
    x_train_df = df.iloc[train_idx] #training data after split
    x_valid_df = df.iloc[val_idx] #validation data after split
    j+=1
    #loading training images
    training_set = train_datagen.flow_from_dataframe(dataframe=x_train_df, directory=image_dir,
                                                 x_col="Image", y_col="ID",
                                                 class_mode=None,
                                                 target_size=(image_size,image_size), batch_size=batch_size)
    #loading validation images
    validation_set = validation_datagen.flow_from_dataframe(dataframe=x_valid_df, directory=image_dir,
                                                 x_col="Image", y_col="ID",
                                                 class_mode=None,
                                                 target_size=(image_size,image_size), batch_size=batch_size)

    # training: THIS IS THE LINE WHERE THE ERROR OCCURS
    history = parallel_model.fit(training_set,
                                 validation_data=validation_set,
                                 epochs = EPOCHS,
                                 steps_per_epoch=x_train_df.shape[0] // batch_size
                                 )

    test_generator = ImageDataGenerator(rescale=1./255)

    test_set = test_generator.flow_from_dataframe(dataframe=image_label, directory=image_dir,
                                                 x_col="Image",y_col=None,
                                                 class_mode=None,
                                                 target_size=(image_size,image_size))

    pred = parallel_model.predict_generator(test_set, len(image_label) // batch_size)
    predicted_class_indices = np.argmax(pred, axis=1)
    data_kfold[j] = predicted_class_indices
    gc.collect()

The error I got:

Found 800 validated image filenames.
Found 400 validated image filenames.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-6b473ab35caf> in <module>()
     20                                  validation_data=validation_set,
     21                                  epochs = EPOCHS,
---> 22                                  steps_per_epoch=x_train_df.shape[0] // batch_size
     23                                  )
     24 

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
   1127           except Exception as e:  # pylint:disable=broad-except
   1128             if hasattr(e, "ag_error_metadata"):
-> 1129               raise e.ag_error_metadata.to_exception(e)
   1130             else:
   1131               raise

TypeError: in user code:
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 878, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 867, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 860, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 813, in train_step
        f'Target data is missing. Your model has `loss`: {self.loss}, '

    TypeError: Target data is missing. Your model has `loss`: binary_crossentropy, and therefore expects target data to be passed in `fit()`.
arvind okram

1 Answer


This is an error with your dataset.

When training, TensorFlow expects both an input to the model and an output label as the ground truth.

Try iterating over your tf.data.Dataset object (or Keras generator) and see whether it returns just a single value (the cause of this error) or two values in a tuple (in the form (model_input, label)).

Example wrong output:

for item in dataset:
    print(item.shape) # shapes shown for brevity
(64, 64, 64, 1)
(64, 64, 64, 1)
(64, 64, 64, 1)
(64, 64, 64, 1)
(64, 64, 64, 1)
...

Example correct output:

for item in dataset:
    print([ i.shape for i in item ])
[TensorShape([64, 64, 64, 1]), TensorShape([64, 64, 64, 2])]
[TensorShape([64, 64, 64, 1]), TensorShape([64, 64, 64, 2])]
[TensorShape([64, 64, 64, 1]), TensorShape([64, 64, 64, 2])]
[TensorShape([64, 64, 64, 1]), TensorShape([64, 64, 64, 2])]
...

Examples taken from a model I'm working on.
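
In the question's code, both the training and validation generators are built with class_mode=None, which makes flow_from_dataframe yield only batches of images and no labels; that is exactly this failure mode. A minimal sketch of the fix, assuming the ID column holds the binary labels (flow_from_dataframe's 'binary' mode expects string-valued labels, hence the cast):

# Sketch of the fix, not the poster's original code: set class_mode='binary'
# so the generator yields (image_batch, label_batch) tuples.
# 'binary' mode requires string class labels, hence the astype(str) cast.
x_train_df = x_train_df.copy()
x_train_df['ID'] = x_train_df['ID'].astype(str)

training_set = train_datagen.flow_from_dataframe(
    dataframe=x_train_df, directory=image_dir,
    x_col='Image', y_col='ID',
    class_mode='binary',  # now yields (images, labels), not images alone
    target_size=(image_size, image_size), batch_size=batch_size)

x_batch, y_batch = next(training_set)  # quick check: two values, not one

The same change applies to the validation generator; the test generator can keep class_mode=None, since labels aren't needed at prediction time.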

starbeamrainbowlabs