I am trying to create an image augmentation pipeline for an object detection network, where my training examples are augmented as they go into the network. The images and the bounding boxes need to be augmented, but the standard tf.image methods don't work with bounding box data.
All the easy augmentation libraries that work with bounding boxes need numpy arrays but I don't know how to convert my Tensors into numpy arrays inside my .map() function. Even when I wrap my augment function in a tf.py_function call I still get the error AttributeError: 'Tensor' object has no attribute 'numpy'
when I try to convert my image via image = image.numpy()
.
my dataset is loaded via this:
def load_tfrecord_dataset(file_pattern, class_file, size=416):
LINE_NUMBER = -1
class_table = tf.lookup.StaticHashTable(tf.lookup.TextFileInitializer(
class_file, tf.string, 0, tf.int64, LINE_NUMBER, delimiter="\n"), -1)
files = tf.data.Dataset.list_files(file_pattern)
dataset = files.flat_map(tf.data.TFRecordDataset)
return dataset.map(lambda x: tf.py_function(parse_tfrecord(x, class_table, size), [x], tf.float32))
# return dataset.map(lambda x: parse_tfrecord(x, class_table, size))
this calls my parsing function:
def parse_tfrecord(tfrecord, class_table, size):
x = tf.io.parse_single_example(tfrecord, IMAGE_FEATURE_MAP)
x_train = tf.image.decode_jpeg(x['image/encoded'], channels=3)
x_train = tf.image.resize(x_train, (size, size))
class_text = tf.sparse.to_dense(
x['image/object/class/text'], default_value='')
labels = tf.cast(class_table.lookup(class_text), tf.float32)
y_train = tf.stack([tf.sparse.to_dense(x['image/object/bbox/xmin']),
tf.sparse.to_dense(x['image/object/bbox/ymin']),
tf.sparse.to_dense(x['image/object/bbox/xmax']),
tf.sparse.to_dense(x['image/object/bbox/ymax']),
labels], axis=1)
x_train, y_train = tf.py_function(augment_images(x_train, y_train), [], tf.uint8)
paddings = [[0, FLAGS.yolo_max_boxes - tf.shape(y_train)[0]], [0, 0]]
y_train = tf.pad(y_train, paddings)
return x_train, y_train
which calls my augment function:
def augment_images(image, boxes):
image = image.numpy()
seq = iaa.Sequential([
iaa.Fliplr(0.5),
iaa.Flipud(0.5)
])
image, label = seq(image=image, bounding_boxes=boxes)
return image, label
But no matter which parts of the code I wrap in a tf.py_function
or where I try to convert to a numpy array, I always get the same error.
What am I doing wrong?