
I want to create a model that can predict a point from an image. I have a dataset of training images split across 24 directories, and I have prepared a JSON file containing (x, y) values for every image.

Example:

"dir22": {
        "frame_00001_rgb": {
            "x": 363.693829827852,
            "y": 278.2191728859505
        },
        "frame_00002_rgb": {
            "x": 330.9709780765119,
            "y": 283.34142472069004
        },
...
...
"dir23": {
        "frame_00001_rgb": {
            "x": 212.5232358000000,
            "y": 156.3342191728855
        },
        "frame_00002_rgb": {
            "x": 230.69497097807351,
            "y": 253.75341424720690
        },

My model looks like this:


import tensorflow as tf

img_width, img_height = 640, 480

train_data_dir = 'v_data/train'

epochs = 10
batch_size = 16

input_tensor = tf.keras.Input(shape=(img_height, img_width, 3))  # Keras expects (height, width, channels)
base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False, input_tensor=input_tensor)

top_model = tf.keras.Sequential()
top_model.add(tf.keras.layers.Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(tf.keras.layers.Dense(128, activation='relu'))
top_model.add(tf.keras.layers.Dense(128, activation='relu'))
top_model.add(tf.keras.layers.Dense(2))

model = tf.keras.Model(inputs=base_model.input, outputs=top_model(base_model.output))

# Freeze all but the last 15 layers so only the top of the network is fine-tuned
for layer in model.layers[:-15]:
    layer.trainable = False


optimizer = tf.keras.optimizers.RMSprop(0.001)

model.compile(loss='mse',
              optimizer=optimizer,
              metrics=['mae', 'mse'])

Now I have loaded the images from my directory:

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator()

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size)

Found 15678 images belonging to 24 classes.

Now, how can I assign a label to each image and train my model with it?

Jakub Balicki
  • Default activation is relu for Dense; I think you should scale your outputs between 0 and 1, use sigmoid instead, and then rescale them – SajanGohil Dec 29 '19 at 12:10
  • This probably needs to be tackled as a regression problem, not a classification problem. I hope that's what you have in mind. That aside, it seems you already have x, y coordinates prepared. So what exactly is the question? Is it how to provide the labels in the JSON file alongside the data from `train_datagen`? – thushv89 Dec 30 '19 at 09:28
  • Yes, that's what I'm looking for. Also, I have set up the last three layers as described here: https://www.tensorflow.org/tutorials/keras/regression. Should I do anything additional to treat this as a regression problem? – Jakub Balicki Dec 30 '19 at 11:41
  • @JakubBalicki No, I think you're fine. I was just pointing that out because you used the word "label", but while reading I realized you are talking about a regression problem. – thushv89 Dec 30 '19 at 23:03

1 Answer


For this, you need to write a custom data generator.

Importing necessary libraries

import os
import json

import numpy as np
import pandas as pd
import tensorflow as tf
from skimage.io import imread        # For reading images from disk
from skimage.transform import resize # For resizing images

Defining our own data generator

I followed this link to get an idea of how to do this, and customized it to your problem.

We need to fill in the following functions.

class DataGenerator(tf.keras.utils.Sequence):
    'Generates data for Keras'

    def __init__(self, directory, target_json, batch_size=32, target_size=(128, 128), shuffle=True):
        ...

    def __len__(self):
        'Denotes the number of batches per epoch'
        ...

    def __getitem__(self, index):
        'Generate one batch of data'
        ...

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        ...

    def __data_generation(self, list_paths, list_paths_wo_ext):
        'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)
        ...

Let's look at the variables we define:

self.target_size = # Final size of the images
self.batch_size = # Batch size
self.target_json = # Path to the json file
self.directory = # Where the training data is

self.img_paths = # Contains image paths with extension
self.img_paths_wo_ext = # Contains the image paths without extension

self.targets = # The dataframe containing targets loaded from the json
self.shuffle = # Shuffle data at start of each epoch?

The JSON file

Your JSON file needs to be in exactly this format; note that valid JSON uses double quotes, and json.load will fail otherwise. This is probably what you already have, but make sure it matches exactly:

{"dir20": {"frame_00001_rgb": {"x": 363.693829827852, "y": 278.2191728859505}, "frame_00002_rgb": {"x": 330.9709780765119, "y": 283.34142472069004}},
 "dir21": {"frame_00001_rgb": {"x": 363.693829827852, "y": 278.2191728859505}, "frame_00002_rgb": {"x": 330.9709780765119, "y": 283.34142472069004}},
 "dir22": {"frame_00001_rgb": {"x": 363.693829827852, "y": 278.2191728859505}, "frame_00002_rgb": {"x": 330.9709780765119, "y": 283.34142472069004}},
 "dir23": {"frame_00001_rgb": {"x": 363.693829827852, "y": 278.2191728859505}, "frame_00002_rgb": {"x": 330.9709780765119, "y": 283.34142472069004}},
 "dir24": {"frame_00001_rgb": {"x": 212.5232358, "y": 156.3342191728855}, "frame_00002_rgb": {"x": 230.6949709780735, "y": 253.7534142472069}}}
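
If you're unsure whether your file parses as valid JSON, a quick sanity check like the following helps (the path './train/data.json' matches the one used later in this answer; adjust it to wherever your file lives):

import json

with open('./train/data.json', 'r') as f:
    targets = json.load(f)  # raises json.JSONDecodeError if the file isn't valid JSON

print(sorted(targets.keys()))               # expect ['dir20', 'dir21', ..., 'dir24']
print(targets['dir20']['frame_00001_rgb'])  # expect {'x': ..., 'y': ...}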

Next, we need to convert this to a pandas DataFrame. For that, we define the following function. It's a bit involved due to the nested structure of your file, but here's what's happening:

  • Load the JSON and create a DataFrame whose columns look like dir20.frame_00002_rgb.x.
  • Create a MultiIndex by splitting each column name into 3 levels (e.g. dir20, frame_00002_rgb, x).
  • Use stack to move both the dir* and frame_* levels into the index.
  • Reformat the index so that it contains the full path of each image, leaving each record with two columns (x and y).

def json_to_df(json_path, directory):
    with open(json_path, 'r') as f:
        s = json.load(f)
    # Flatten the nested dict into columns like 'dir20.frame_00001_rgb.x'
    # (on pandas < 1.0, use pd.io.json.json_normalize instead)
    df = pd.json_normalize(s)
    ind = pd.MultiIndex.from_tuples([col.split('.') for col in df.columns])
    df.columns = ind
    df = df.stack(level=[0, 1])
    df = df.set_index(df.index.droplevel(0))
    df = df.set_index(pd.Index([os.path.sep.join([directory] + list(c)) for c in df.index.values]))
    return df
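
As a quick sanity check, here's roughly what you'd see with the sample JSON above (the JSON path and the 'v_data/train' directory come from the question and this answer, and are assumptions on my part):

df = json_to_df('./train/data.json', 'v_data/train')
print(df.head())
# Each index entry is a full image path without extension, e.g.
# 'v_data/train/dir20/frame_00001_rgb', and each row has two columns: x and y.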

Rest of the code

I won't go into great detail about the other parts, as they are quite straightforward. Essentially, we get a single batch of data by reading the images, resizing them, and looking up the correct x, y values in the DataFrame we generated.

Full code

Here's the full code for the data generator.

class DataGenerator(tf.keras.utils.Sequence):
    'Generates data for Keras'

    def __init__(self, directory, target_json, batch_size=32, target_size=(128, 128), shuffle=True):
        'Initialization'
        self.target_size = target_size
        self.batch_size = batch_size
        self.target_json = target_json
        self.directory = directory

        self.img_paths = [] 
        self.img_paths_wo_ext = []      
        for root, dirs, files in os.walk(directory):
            for file in files:
                if file.lower().endswith(('.jpg', '.png')):
                    self.img_paths.append(os.path.join(root, file))
                    self.img_paths_wo_ext.append(os.path.splitext(os.path.join(root, file))[0])

        def json_to_df(json_path, directory):
            with open(json_path, 'r') as f:
                s = json.load(f)
            # Flatten the nested dict into columns like 'dir20.frame_00001_rgb.x'
            # (on pandas < 1.0, use pd.io.json.json_normalize instead)
            df = pd.json_normalize(s)
            ind = pd.MultiIndex.from_tuples([col.split('.') for col in df.columns])
            df.columns = ind
            df = df.stack(level=[0, 1])
            df = df.set_index(df.index.droplevel(0))
            df = df.set_index(pd.Index([os.path.sep.join([directory] + list(c)) for c in df.index.values]))
            return df

        self.targets = json_to_df(self.target_json, self.directory)
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.floor(len(self.img_paths) / self.batch_size))

    def __getitem__(self, index):
        'Generate one batch of data'
        # Generate indexes of the batch
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        # Find list of IDs
        list_paths = [self.img_paths[k] for k in indexes]
        list_paths_wo_ext = [self.img_paths_wo_ext[k] for k in indexes]
        # Generate data
        X, y = self.__data_generation(list_paths, list_paths_wo_ext)

        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.img_paths))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_paths, list_paths_wo_ext):
        'Generates data containing batch_size samples' # X : (n_samples, *dim, n_channels)
        # Initialization
        X = np.empty((self.batch_size, *self.target_size, 3))
        # Look up the (x, y) targets by extension-less image path; every image
        # on disk must have a matching entry in the JSON for this to work
        y = self.targets.loc[list_paths_wo_ext].values

        # Generate data
        for i, ID in enumerate(list_paths):
            # Read the image and resize it to the target size
            # (skimage's resize also scales pixel values to [0, 1])
            X[i,] = resize(imread(ID), self.target_size)

        return X, y

Using the data generator

Here's how you'd use the data generator.

train_datagen = iter(DataGenerator(train_data_dir, './train/data.json', batch_size=2))

x, y = next(train_datagen)
print(x)
print(y)

which gives,

[[0.01377145 0.01377145 0.01377145]
   [0.00242393 0.00242393 0.00242393]
   [0.         0.         0.        ]
   ...
   [0.0037837  0.0037837  0.0037837 ]
   [0.0037837  0.0037837  0.0037837 ]
   [0.0037837  0.0037837  0.0037837 ]]

  ...

  [[0.37398897 0.3372549  0.17647059]
   [0.38967525 0.35294118 0.19215686]
   [0.42889093 0.39215686 0.23137255]
   ...
   [0.72156863 0.62889093 0.33085172]
   [0.71372549 0.61176471 0.31764706]
   [0.70588235 0.59359681 0.30340074]]]]

[[363.69382983 278.21917289]
 [330.97097808 283.34142472]]
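
Finally, to train the model: since DataGenerator subclasses tf.keras.utils.Sequence, you can pass it directly to model.fit. Here's a minimal sketch, assuming the model, train_data_dir, batch_size, and epochs from the question, and the JSON path used above:

train_gen = DataGenerator(train_data_dir, './train/data.json',
                          batch_size=batch_size,
                          target_size=(img_height, img_width))  # match the model's input shape

model.fit(train_gen, epochs=epochs)

If you take the suggestion from the comments and put a sigmoid on the output layer, remember to divide the x targets by img_width and the y targets by img_height inside the generator, and to rescale the predictions back afterwards.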
thushv89