
I am new to the tf.data API and trying to use it to load images from disk for the Dogs vs. Cats Redux: Kernels Edition Kaggle competition. To do this, I first created a pandas DataFrame named train_df with two columns: file_path, containing the relative path of each image, and target, containing the target labels 0 (for cat) and 1 (for dog). Here's what the first 10 rows of the DataFrame look like:

[screenshot: first 10 rows of train_df, showing the file_path and target columns]
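
A DataFrame like this can be built with something along these lines (a sketch; it assumes the standard Kaggle train/ layout where files are named cat.<N>.jpg and dog.<N>.jpg):

import os
import pandas as pd

train_dir = "train"  # assumed location of the extracted competition images
file_names = sorted(os.listdir(train_dir))

train_df = pd.DataFrame({
    "file_path": [os.path.join(train_dir, f) for f in file_names],
    # 1 for dog, 0 for cat, inferred from the file name prefix
    "target": [1 if f.startswith("dog") else 0 for f in file_names],
})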

Then, I tried loading the images with the following code:

import tensorflow as tf


BATCH_SIZE = 128
IMG_HEIGHT = 224
IMG_WIDTH = 224

def read_images(X, y):
    X = tf.io.read_file(X)
    X = tf.io.decode_image(X, expand_animations=False, dtype=tf.float32, channels=3)
    X = tf.image.resize(X, [IMG_HEIGHT, IMG_WIDTH])
    X = tf.keras.applications.efficientnet.preprocess_input(X, data_format="channels_last")

    return (X, y)


def build_data_pipeline(X, y):
    data = tf.data.Dataset.from_tensor_slices((X, y))
    data = data.map(read_images)
    data = data.batch(BATCH_SIZE)
    data = data.prefetch(tf.data.AUTOTUNE)

    return data


tf_data = build_data_pipeline(train_df["file_path"], train_df["target"])
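
To double-check what the pipeline actually emits, one batch can be pulled and inspected (a quick sanity-check sketch, not part of the training code):

# Pull a single batch and look at shapes, dtypes and the pixel value range
images, labels = next(iter(tf_data))
print(images.shape, labels.shape)  # expected: (128, 224, 224, 3) (128,)
print(images.dtype, float(tf.reduce_min(images)), float(tf.reduce_max(images)))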

After this, I tried training my model using the following code:

model.fit(tf_data, epochs=10)

but got a training accuracy of only 50%, whereas with ImageDataGenerator I get an accuracy of 99%. So the problem must lie somewhere in the data loading part, but I am not able to find it.
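
For comparison, an ImageDataGenerator pipeline over the same DataFrame looks roughly like this (a sketch with illustrative options, not necessarily the exact ones I used):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# No rescale here: EfficientNet models expect raw [0, 255] pixel values
datagen = ImageDataGenerator()
generator = datagen.flow_from_dataframe(
    # flow_from_dataframe expects string class labels for class_mode="binary"
    train_df.assign(target=train_df["target"].astype(str)),
    x_col="file_path",
    y_col="target",
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,
    class_mode="binary",
)
model.fit(generator, epochs=10)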

I have used EfficientNetB0 with ImageNet weights as the feature extractor and a single-neuron dense layer at the end as the classifier.

Pretrained EfficientNetB0 model:

pretrained_model = tf.keras.applications.EfficientNetB0(
    input_shape=(IMG_HEIGHT, IMG_WIDTH, 3),
    include_top=False,
    weights="imagenet"
)

for layer in pretrained_model.layers:
    layer.trainable = False

Dense layer with one neuron at the end of the EfficientNetB0:

pretrained_output = pretrained_model.get_layer('top_activation').output
x = tf.keras.layers.GlobalAveragePooling2D()(pretrained_output)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.models.Model(pretrained_model.input, x)

Compiling the model:

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

1 Answer


In the above notebook, change the input reading function read_images as follows:

def read_images(X, y):
    X = tf.io.read_file(X)
    X = tf.image.decode_jpeg(X, channels = 3)
    X = tf.image.resize(X, [IMG_HEIGHT, IMG_WIDTH]) #/255.0
    return (X, y)

Also note that the tf.keras.applications EfficientNet-Bx models have a built-in normalization layer, so it's better not to normalize the data in the above function (i.e. /255.0).
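
This can be verified by listing the first few layers of the pretrained model (a quick sketch; layer names may vary slightly between TF versions):

# The input side of EfficientNetB0 typically contains Rescaling and
# Normalization layers, i.e. the network expects raw [0, 255] pixels.
for layer in pretrained_model.layers[:5]:
    print(layer.name, type(layer).__name__)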

Innat
  • This solved the problem :) Thanks for putting time and effort into it. However, I still don't understand why the decode_image function didn't work. The documentation says this: _Detects whether an image is a BMP, GIF, JPEG, or PNG, and performs the appropriate operation to convert the input bytes string into a Tensor of type dtype._ All the images in this problem were in JPEG format, so tf.io.decode_jpeg worked, but what if a dataset consists of images of different types? – Gulshan Mishra Dec 26 '21 at 10:57
  • For other common formats (jpeg, png), it's okay to use either of the following: [`tf.image.decode_jpeg`](https://www.tensorflow.org/api_docs/python/tf/io/decode_jpeg) or [`tf.image.decode_png`](https://www.tensorflow.org/api_docs/python/tf/io/decode_png). Read the respective documentation; it's mentioned there. – Innat Jan 10 '22 at 10:29
  • About the cause with `tf.image.decode_image`, I didn't look into it while giving the answer; I just used what I normally prefer. I'm not sure what's wrong with it, I may need to look again. If you're interested or it's causing any further issues, please feel free to ask a new question and ping me if you want. – Innat Jan 10 '22 at 10:32
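
On that point, the two decode calls differ in the value range they return, which is most likely what threw the model off: with dtype=tf.float32, decode_image converts pixels to the [0, 1] range, while EfficientNet models expect raw [0, 255] inputs (their preprocess_input is essentially a pass-through). A quick comparison (sketch; the file path is hypothetical):

# Compare the output range of the two decode paths on one image file
raw = tf.io.read_file("train/cat.0.jpg")  # hypothetical path

a = tf.io.decode_image(raw, expand_animations=False, dtype=tf.float32, channels=3)
b = tf.image.decode_jpeg(raw, channels=3)

print(a.dtype, float(tf.reduce_min(a)), float(tf.reduce_max(a)))  # float32, values in [0, 1]
print(b.dtype, int(tf.reduce_min(b)), int(tf.reduce_max(b)))      # uint8, values in [0, 255]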