This code prepares image data for training a deep learning model.
The classes variable is a list of object categories, and the label_data variable is a dictionary mapping image file names to their corresponding object categories.
The read_image_and_label function reads in an image file, resizes it, and converts the object categories to a one-hot encoded vector.
The create_dataset function uses tf.data.Dataset to create a dataset from a list of image file paths, applies the read_image_and_label function to each file path, shuffles the data, and batches it.
Finally, train_file_paths and val_file_paths are lists of file paths for the training and validation sets, and train_data and val_data are the datasets created by calling create_dataset on those lists.
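The label conversion described above can be illustrated without TensorFlow. This is only a minimal sketch: it assumes label_data maps a file name to a list of class names (the two entries below are made up, since the real label_data is not shown) and that one fixed-length multi-hot vector per image is wanted. The helper name to_multi_hot is hypothetical.

```python
classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
           'bus', 'car', 'cat', 'chair', 'cow',
           'diningtable', 'dog', 'horse', 'motorbike', 'person',
           'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

# Hypothetical entries; the real label_data is not shown in the question.
label_data = {
    'img_001.jpg': ['cat', 'sofa'],
    'img_002.jpg': ['person'],
}

# Look-up table from class name to its position in the classes list.
class_to_index = {name: i for i, name in enumerate(classes)}

def to_multi_hot(labels):
    """Return one vector of len(classes) with 1.0 at every listed class."""
    vec = [0.0] * len(classes)
    for name in labels:
        vec[class_to_index[name]] = 1.0
    return vec

vec = to_multi_hot(label_data['img_001.jpg'])
print(vec)
```

By contrast, tf.one_hot applied to a list of k indices returns a k x 20 matrix, one row per object, so images with different numbers of objects get labels of different shapes.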
The error I am getting is a TypeError. It says that in read_image_and_label(), on line 18, a Tensor object is being converted to a Path object, but Path() expects a str, bytes, or os.PathLike object as its argument.
It seems the file_path argument passed to read_image_and_label() is a Tensor, not a string or bytes object. This is because dataset.map() traces the function in graph mode and calls it with a symbolic tensor standing in for each element of the dataset built by tf.data.Dataset.from_tensor_slices(), rather than with the Python strings from the original list.
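The behaviour described above can be checked with a small experiment. This is a sketch only: the file names are placeholders (nothing is read from disk) and the helper name inspect_arg is made up for illustration.

```python
import tensorflow as tf

seen = []  # records what the mapped function receives

def inspect_arg(file_path):
    # tf.data calls this while tracing the function into a graph,
    # so file_path is a symbolic tensor rather than a Python string.
    seen.append((type(file_path).__name__, hasattr(file_path, 'numpy')))
    return file_path

paths = ['a.jpg', 'b.jpg']  # placeholder names, no files are read
dataset = tf.data.Dataset.from_tensor_slices(paths).map(inspect_arg)

print(seen)                    # type name and whether .numpy() is available
print(tf.executing_eagerly())  # eager mode is still on outside the map
```

Because the argument carries no concrete value at trace time, pure-Python calls such as Path(file_path) or a dictionary lookup cannot work on it directly.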
import os
from pathlib import Path

import tensorflow as tf

classes = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
           'bus', 'car', 'cat', 'chair', 'cow',
           'diningtable', 'dog', 'horse', 'motorbike', 'person',
           'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
# Define the batch size and image size
batch_size = 32
img_size = (224, 224)
def read_image_and_label(file_path):
    # Convert file_path to a Path object and get its name
    file_name = Path(file_path).name
    # Get the labels for the file from label_data
    labels = label_data[file_name]
    # Read the image file and resize it
    img = tf.io.read_file(file_path)
    img = tf.io.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, size=img_size, method='bicubic')
    img = img / 255.0
    # Convert the labels to a one-hot vector
    label = tf.one_hot([classes.index(l) for l in labels], len(classes))
    return img, label
# Define a function to create a dataset from a list of file paths
def create_dataset(file_paths):
    # Create a dataset of file paths and apply the read_image_and_label function to each file path
    dataset = tf.data.Dataset.from_tensor_slices(file_paths)
    dataset = dataset.map(read_image_and_label)
    # Shuffle and batch the dataset
    dataset = dataset.shuffle(1000).batch(batch_size)
    return dataset
# Define the train and validation file paths
train_file_paths = [os.path.join(data_dir, 'train', f) for f in os.listdir(os.path.join(data_dir, 'train')) if f.endswith('.jpg')]
val_file_paths = [os.path.join(data_dir, 'valid', f) for f in os.listdir(os.path.join(data_dir, 'valid')) if f.endswith('.jpg')]
# print(train_file_paths)
# Create the train and validation datasets
train_data = create_dataset(train_file_paths)
val_data = create_dataset(val_file_paths)
To resolve this, I tried converting the tensor to a string first:
def read_image_and_label(file_path):
    # Convert file_path to a string
    file_path = file_path.numpy().decode('utf-8')
    # Convert file_path to a Path object and get its name
    file_name = Path(file_path).name
    # Get the labels for the file from label_data
    labels = label_data[file_name]
    # Read the image file and resize it
    img = tf.io.read_file(file_path)
    img = tf.io.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, size=img_size, method='bicubic')
    img = img / 255.0
    # Convert the labels to a one-hot vector
    label = tf.one_hot([classes.index(l) for l in labels], len(classes))
    return img, label
but this gives the error: AttributeError: 'Tensor' object has no attribute 'numpy'
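One possible workaround for this error is tf.py_function, which runs a Python function eagerly inside the pipeline so that .numpy() is available. The following is a sketch under stated assumptions, not a definitive fix: the class list is shortened, the label_data entry is made up, a tiny dummy JPEG is written to a temporary directory so the example runs on its own, and the per-object one-hot rows are collapsed into one multi-hot vector (my assumption about the desired label format). The helper name _read_eagerly is hypothetical.

```python
import os
import tempfile
from pathlib import Path

import tensorflow as tf

classes = ['cat', 'dog', 'person']           # shortened list for the sketch
label_data = {'sample.jpg': ['cat', 'dog']}  # hypothetical entry
img_size = (224, 224)

# Write a small dummy JPEG so the example runs without a real data set.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.jpg')
tf.io.write_file(sample_path, tf.io.encode_jpeg(tf.zeros((8, 8, 3), tf.uint8)))

def _read_eagerly(file_path):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() works.
    path = file_path.numpy().decode('utf-8')
    labels = label_data[Path(path).name]
    img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    img = tf.image.resize(img, size=img_size, method='bicubic') / 255.0
    # Assumption: collapse the per-object one-hot rows into one multi-hot vector.
    one_hot = tf.one_hot([classes.index(l) for l in labels], len(classes))
    return img, tf.reduce_max(one_hot, axis=0)

def read_image_and_label(file_path):
    img, label = tf.py_function(_read_eagerly, [file_path], [tf.float32, tf.float32])
    # py_function drops static shape information, so restore it for batching.
    img.set_shape((img_size[0], img_size[1], 3))
    label.set_shape((len(classes),))
    return img, label

dataset = tf.data.Dataset.from_tensor_slices([sample_path])
dataset = dataset.map(read_image_and_label).batch(1)
images, labels = next(iter(dataset))
print(images.shape, labels.numpy())
```

The trade-off is that code inside tf.py_function runs as ordinary Python, so it generally does not get the graph-level parallelism of a pure TensorFlow map function.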
I have also tried converting the dataset with tf.make_ndarray:
def create_dataset(file_paths):
    # Create a dataset of file paths and apply the read_image_and_label function to each file path
    dataset = tf.data.Dataset.from_tensor_slices(file_paths)
    dataset = tf.make_ndarray(dataset)
    dataset = dataset.map(read_image_and_label)
    # Shuffle and batch the dataset
    dataset = dataset.shuffle(1000).batch(batch_size)
    return dataset
but this gives the error: AttributeError: 'TensorSliceDataset' object has no attribute 'tensor_shape'
What I want to achieve is a tensor dataset for training and one for validation, where each element is an image together with its label.
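For context on this second error: tf.make_ndarray expects a TensorProto, not a Dataset, which would explain the failed tensor_shape lookup. The sketch below (placeholder file names, nothing read from disk) contrasts the two: a TensorProto converted with tf.make_ndarray, and a Dataset inspected through as_numpy_iterator().

```python
import tensorflow as tf

paths = ['a.jpg', 'b.jpg']  # placeholder file names

# tf.make_ndarray converts a TensorProto into a NumPy array.
proto = tf.make_tensor_proto(paths)
array = tf.make_ndarray(proto)

# A Dataset is inspected by iterating over it instead.
dataset = tf.data.Dataset.from_tensor_slices(paths)
elements = [p.decode('utf-8') for p in dataset.as_numpy_iterator()]

print(array)
print(elements)
```

Note that as_numpy_iterator() is only useful for looking at the data; the result is no longer a Dataset, so .map() cannot be called on it.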
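One way this goal might be reached is to do the label lookup in plain Python before the dataset is built, so that the mapped function contains only TensorFlow ops. The following is a sketch under assumptions, not a tested solution for the real data: the class list is shortened, label_data and the two tiny JPEGs are dummies written to a temporary directory, labels are encoded as multi-hot vectors, and the helper names encode_labels and load_image are hypothetical.

```python
import os
import tempfile

import tensorflow as tf

classes = ['cat', 'dog', 'person']   # shortened list for the sketch
batch_size = 2
img_size = (224, 224)

# Dummy data so the sketch is self-contained: two tiny JPEGs plus made-up labels.
data_dir = tempfile.mkdtemp()
label_data = {'a.jpg': ['cat'], 'b.jpg': ['dog', 'person']}
for name in label_data:
    jpeg = tf.io.encode_jpeg(tf.zeros((8, 8, 3), tf.uint8))
    tf.io.write_file(os.path.join(data_dir, name), jpeg)

def encode_labels(names):
    # Plain Python, executed before the dataset is built.
    vec = [0.0] * len(classes)
    for n in names:
        vec[classes.index(n)] = 1.0
    return vec

def load_image(file_path, label):
    # Only TensorFlow ops here, so the function can be traced in graph mode.
    img = tf.io.decode_jpeg(tf.io.read_file(file_path), channels=3)
    img = tf.image.resize(img, size=img_size, method='bicubic') / 255.0
    return img, label

def create_dataset(file_paths):
    # Look up labels by file name while the paths are still Python strings.
    labels = [encode_labels(label_data[os.path.basename(p)]) for p in file_paths]
    dataset = tf.data.Dataset.from_tensor_slices((file_paths, labels))
    dataset = dataset.map(load_image)
    return dataset.shuffle(1000).batch(batch_size)

file_paths = sorted(os.path.join(data_dir, f)
                    for f in os.listdir(data_dir) if f.endswith('.jpg'))
train_data = create_dataset(file_paths)
images, labels = next(iter(train_data))
print(images.shape, labels.shape)
```

The same create_dataset call would be repeated with the validation paths to obtain val_data.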