My data does not fit into memory, so I need the equivalent of the ImageDataGenerator class's flow_from_directory, but one that works with tf.data. I found image_dataset_from_directory, a Keras utility function that generates a tf.data.Dataset from image files in a directory. So I loaded my data (images & masks) as follows:
import tensorflow as tf

BATCH_SIZE = None
IMG_HEIGHT = 256
IMG_WIDTH = 256
IMG_CHANNELS = 1
seed = 42

# shared keyword arguments for image_dataset_from_directory
tf_Dataset_args = dict(labels=None,
                       label_mode=None,
                       validation_split=0.2,
                       batch_size=BATCH_SIZE,
                       image_size=(IMG_HEIGHT, IMG_WIDTH),
                       seed=seed,
                       color_mode="grayscale"
                       )
#---------- train images, split train/val
# image_dataset_from_directory is a Keras utility that generates a tf.data.Dataset from image files in a directory.
# A tf.data.Dataset represents a potentially large set of elements.
train_image_ds = tf.keras.utils.image_dataset_from_directory(train_images_path,
                                                              subset="training",
                                                              **tf_Dataset_args
                                                              )
validation_image_ds = tf.keras.utils.image_dataset_from_directory(train_images_path,
                                                                   subset="validation",
                                                                   **tf_Dataset_args
                                                                   )
#----------- train masks, split train/val
train_masks_ds = tf.keras.utils.image_dataset_from_directory(train_masks_path,
                                                              subset="training",
                                                              **tf_Dataset_args
                                                              )
validation_masks_ds = tf.keras.utils.image_dataset_from_directory(train_masks_path,
                                                                   subset="validation",
                                                                   **tf_Dataset_args
                                                                   )
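As a sanity check (a sketch; since batch_size=None the datasets are unbatched, so each element should be a single grayscale image):

print(train_image_ds.element_spec)  # TensorSpec(shape=(256, 256, 1), dtype=tf.float32, name=None)
print(train_masks_ds.element_spec)  # same spec; the shared seed should keep the two splits aligned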
Then I combined images and masks to create a tf.data.Dataset:
# The simplest way to create a dataset is from a Python list: a nested structure of images and masks
train_set = list(zip(train_image_ds, train_masks_ds))
validation_set = list(zip(validation_image_ds, validation_masks_ds))
training_data = tf.data.Dataset.from_tensor_slices(train_set)  # represents a potentially large set of elements
validation_data = tf.data.Dataset.from_tensor_slices(validation_set)  # I tried zip inside but it did not work
My training and validation datasets have shape (nb_images, 2, 256, 256, 1), or (nb_images/batch_size, 2, batch_size, 256, 256, 1) when batch_size is not None.
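This matches the element spec I see when I inspect the combined dataset (unbatched case):

print(training_data.element_spec)
# TensorSpec(shape=(2, 256, 256, 1), dtype=tf.float32, name=None)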
I then add the data-augmentation block below and pass the dataset through it:
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255),
    tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.5),
    # tf.keras.layers.experimental.preprocessing.RandomTranslation(0.3),
    # tf.keras.layers.experimental.preprocessing.RandomHeight(0.1),
    # tf.keras.layers.experimental.preprocessing.RandomWidth(0.1)
])
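Roughly, the failing call looks like this (a sketch reconstructed from the traceback below; the Sequential receives the whole TensorSliceDataset object instead of image tensors):

augmented = data_augmentation(training_data)  # training_data is a tf.data.Dataset, not a Tensor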
I get:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor. Received: inputs=<TensorSliceDataset element_spec=TensorSpec(shape=(2, 256, 256, 1), dtype=tf.float32, name=None)>. Consider rewriting this model with the Functional API.

ValueError: Exception encountered when calling layer "rescaling" (type Rescaling).
Attempt to convert a value (<TensorSliceDataset element_spec=TensorSpec(shape=(2, 256, 256, 1), dtype=tf.float32, name=None)>) with an unsupported type (<class 'tensorflow.python.data.ops.dataset_ops.TensorSliceDataset'>) to a Tensor.
Call arguments received:
  • inputs=<TensorSliceDataset element_spec=TensorSpec(shape=(2, 256, 256, 1), dtype=tf.float32, name=None)>
I also have a problem passing the tf.data.Dataset training_data to the .fit method, because its element shape is inconsistent with the model's input shape (None, 256, 256, 1).
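For context, a sketch of that call (model is hypothetical shorthand for my segmentation network, whose input shape is (None, 256, 256, 1)):

model.fit(training_data, validation_data=validation_data, epochs=10)
# fails: each element has shape (2, 256, 256, 1), i.e. image and mask stacked
# on one axis, not an (image, mask) tuple matching the (None, 256, 256, 1) input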