
I want to create image sequence samples using the tf.data API, but as of now there seems to be no easy way to concatenate multiple images into a single sample. I have tried to use the dataset.window function, which groups my images correctly, but I don't know how to concatenate them.

import tensorflow as tf
from glob import glob

IMG_WIDTH = 256
IMG_HEIGHT = 256

def load_and_process_image(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [IMG_WIDTH, IMG_HEIGHT])
    img = tf.reshape(img, shape=(IMG_WIDTH, IMG_HEIGHT, 1, 3))
    return img

def create_dataset(files, time_distance=8, frame_step=1):
    dataset = tf.data.Dataset.from_tensor_slices(files)
    dataset = dataset.map(load_and_process_image)
    dataset = dataset.window(time_distance, 1, frame_step, True)

    # TODO: Concatenate elements from dataset.window
    return dataset

files = sorted(glob('some/path/*.jpg'))
images = create_dataset(files)

I know that I could save my image sequences as TFRecords, but that would make my data pipeline much less flexible and would cost a lot of memory.

My input batches should have the form N x W x H x T x C (N: number of samples, W: image width, H: image height, T: image sequence length, C: image channels).

1 Answer


You can use batching to create batches of size N.

iterations = 10  # e.g. take 10 batches
N = 4            # e.g. batch size of 4
batched_dataset = dataset.batch(N)
for batch in batched_dataset.take(iterations):
    # process your batch
    pass

Here iterations is the number of batches you want to generate.

samu
  • That is not what I want to do. I want to stack N tensors along my designated time axis. So I want to map a MapDataset to a single Tensor. – Marc Seibert Jan 19 '20 at 14:53
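
For reference, a minimal sketch of what the comment describes: each window produced by dataset.window is itself a nested Dataset, so one way to collapse it into a single tensor is to flat_map it with a batch of size time_distance and then rearrange the axes. The function name create_sequence_dataset below is illustrative, it reuses load_and_process_image and the window parameters from the question, and the squeeze/transpose step is just one possible way to reach the W x H x T x C layout.

import tensorflow as tf
from glob import glob

def create_sequence_dataset(files, time_distance=8, frame_step=1):
    dataset = tf.data.Dataset.from_tensor_slices(files)
    dataset = dataset.map(load_and_process_image)  # each element: (W, H, 1, 3)
    dataset = dataset.window(time_distance, 1, frame_step, True)
    # Each window is a nested Dataset of `time_distance` images; batching the
    # window collapses it into one tensor of shape (T, W, H, 1, 3).
    dataset = dataset.flat_map(
        lambda window: window.batch(time_distance, drop_remainder=True))
    # Drop the singleton axis and move time into third position:
    # (T, W, H, 1, 3) -> (W, H, T, 3)
    dataset = dataset.map(
        lambda seq: tf.transpose(tf.squeeze(seq, axis=3), perm=[1, 2, 0, 3]))
    return dataset

files = sorted(glob('some/path/*.jpg'))
sequences = create_sequence_dataset(files)
batches = sequences.batch(4)  # elements of shape (4, W, H, T, 3)

After the final batch call, each element has the requested N x W x H x T x C layout.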