Using tensorflow.data to generate dataset of images and multiple labels

Question

I am trying to train a neural network to draw a bounding box around an object. I generated the data myself: 256x256 RGB images with five labels per image (two corners of the bounding box plus a rotational component). To avoid running out of memory when training with Python 3.7.6, TensorFlow 2.0 and Keras, I load only a small number of images at a time, train the network on them, and then load a new set. This all happens sequentially (and, since I am not a very good programmer, probably not efficiently), which has left me with what appears to be a severe bottleneck caused by the way I load images and labels. The images are saved as .jpg files with numerical names, and my labels are stored in a text file where each row corresponds to an image name.

To reduce or eliminate the bottleneck I have read about tf.data and tried to follow the example at https://www.tensorflow.org/tutorials/load_data/images#using_tfdata_for_finer_control. However, these examples deal with classification, so the labels are generated in a different way. I have tried to alter that code like this:

import numpy as np
import tensorflow as tf
import os

height=256
width=256

AUTOTUNE = tf.data.experimental.AUTOTUNE
image_count=len(os.listdir('images_train'))

list_ds = tf.data.Dataset.list_files(str('images_train/*'), shuffle=False)
list_ds = list_ds.shuffle(image_count, reshuffle_each_iteration=False)

print('\n')
for f in list_ds.take(5):
    print(f.numpy())

def decode_img(img):
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_jpeg(img, channels=3)
    # resize the image to the desired size
    return tf.image.resize(img, [height, width])  

#This is the function I cannot figure out how to write
def get_label(file_path):
    labels=np.loadtxt('labels_train.txt', delimiter=',')
    # convert the path to a list of path components
    parts = tf.strings.split(file_path, os.path.sep)
    
    #somehow I would like to extract the name of the image file and then take the numerical part and 
    #return the corresponding row
    return labels[0,:]
   
def process_path(file_path):
    label = get_label(file_path)
    # load the raw data from the file as a string
    img = tf.io.read_file(file_path)
    img = decode_img(img)
    return img, label

train_ds = list_ds.map(process_path, num_parallel_calls=AUTOTUNE)

When I just return a fixed row from the file, the rest of the script seems to run fine, but I cannot figure out how to pair each image with its corresponding label. To extract the image name inside the get_label() function I tried parts.numpy(), which only yields this error: AttributeError: 'Tensor' object has no attribute 'numpy'.

I have been trying to figure this out for a few days now and have not been able to find a post that quite describes the same issue.

How does one solve this issue in an efficient way, without being a skilled programmer? Anything that points me in the right direction is greatly appreciated.
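One possible way to pair images with labels, sketched under assumptions rather than taken from the post: the `.numpy()` call fails because `Dataset.map` traces `process_path` into a graph, where tensors are symbolic and have no eager value. A way around this is to match file names to label rows in plain Python before the dataset is built, then hand paths and labels to `tf.data.Dataset.from_tensor_slices` together. The helper names `load_image` and `build_dataset` are hypothetical, and the sketch assumes image `i.jpg` belongs to row `i` (0-based) of the label file, which the post does not state.

```python
import os

import numpy as np
import tensorflow as tf

HEIGHT, WIDTH = 256, 256
AUTOTUNE = tf.data.experimental.AUTOTUNE


def load_image(path, label):
    """Read and decode one JPEG; the label is passed through unchanged."""
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [HEIGHT, WIDTH])
    return img, label


def build_dataset(image_dir, label_file, batch_size=32):
    """Build a dataset of (image, label) pairs matched by file number."""
    # Runs eagerly in plain Python/NumPy, so no .numpy() is needed later.
    labels = np.loadtxt(label_file, delimiter=',', dtype=np.float32, ndmin=2)

    names = [n for n in os.listdir(image_dir) if n.endswith('.jpg')]
    # ASSUMPTION: "12.jpg" corresponds to row 12 (0-based) of the label file.
    indices = [int(os.path.splitext(n)[0]) for n in names]
    paths = [os.path.join(image_dir, n) for n in names]

    # Each path is sliced together with its own label row.
    ds = tf.data.Dataset.from_tensor_slices((paths, labels[indices]))
    ds = ds.shuffle(len(paths))  # shuffling paths is cheap; images load later
    ds = ds.map(load_image, num_parallel_calls=AUTOTUNE)
    return ds.batch(batch_size).prefetch(AUTOTUNE)
```

With the question's layout this could be called as `train_ds = build_dataset('images_train', 'labels_train.txt')`. Only the paths and the small label array are held in memory up front; the images themselves are read batch by batch.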

EDIT: I ended up going with a different solution, heavily inspired by the example found here: https://github.com/kalaspuffar/tensorflow-data/blob/master/create_dataset.py. I found it easier to follow along with the example given there.

You can rewrite the get_label() function as `def get_label(file_path): labels = np.genfromtxt('labels_train.txt', dtype='str', delimiter=','); labels_list = [i.split('.')[0] for i in labels]`. labels_list then contains the numeric part corresponding to each row. Thanks. (Comment, Nov 24 '20 at 12:12)
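If the lookup should stay inside `get_label()`, as the comment suggests, it has to use TensorFlow string ops, because Python string methods such as `split` cannot be applied to a symbolic tensor inside `map`. The following is a minimal sketch of that idea, not the commenter's code. The factory name `make_get_label` is hypothetical, and it again assumes that the number in the file name is the 0-based row index of the label file.

```python
import os

import numpy as np
import tensorflow as tf


def make_get_label(label_rows):
    """Return a graph-compatible get_label that looks rows up by file number."""
    # The label array is converted to a constant once, outside the mapped function.
    table = tf.constant(label_rows, dtype=tf.float32)

    def get_label(file_path):
        # "images_train/12.jpg" -> "12.jpg"
        file_name = tf.strings.split(file_path, os.path.sep)[-1]
        # "12.jpg" -> "12"
        stem = tf.strings.regex_replace(file_name, r"\.jpg$", "")
        # "12" -> 12; ASSUMPTION: this is the 0-based row of the label file.
        index = tf.strings.to_number(stem, out_type=tf.int32)
        return tf.gather(table, index)

    return get_label
```

In the question's script this could be wired in with `get_label = make_get_label(np.loadtxt('labels_train.txt', delimiter=','))` before `process_path` is mapped over `list_ds`.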
