How to use a cv2 image augmentation function with tensorflow tf.data.Dataset?

Question

I am using tf.data.Dataset to create my dataset and am training a CNN with Keras. I need to apply masks to the images. Each mask depends on the shape of the image, so there are no predefined pixel coordinates.

When looking for an answer on the internet, I found that there are two ways of accessing the shapes of images in TensorFlow at training time:

- Using eager execution (which is not enabled by default in my case, I'm using tf v 12.0)
- Using a session

I do not want to use eager execution because it slows down training, and I cannot use a session because I train and test the CNN using Keras (I feed the data to model.fit() using iterators of tf.data.Dataset).

As a consequence, I have no way of knowing the shapes of images, and thus cannot access specific pixels for data augmentation.

I wrote a function using OpenCV (cv2) that applies the masks. Is there a way to integrate it with the TensorFlow data pipeline?

EDIT: I found a solution. I used tf.py_func to wrap the Python functions.

This question is off-topic here. Programming issues are generally off-topic here. See [https://ai.stackexchange.com/help/on-topic](https://ai.stackexchange.com/help/on-topic) for more details. I will migrate this question to Stack Overflow. — nbro, Mar 27 '20 at 16:58

score 1 · Answer 1 · answered Mar 27 '20 at 17:20

NOTE: Since you need image augmentation, here is some information on several image-augmentation libraries. This does not show how to add an OpenCV function to your tf.data pipeline, but if your requirements are standard enough, you may be able to use one of these:

tf.keras.preprocessing.image.ImageDataGenerator
imgaug
albumentations

Data Augmentation in Python

Package: albumentations
library: external
url: Python albumentations library
Package: imgaug :star:
library: external
url: Python imgaug library
Package: tf.keras.preprocessing.image.ImageDataGenerator
library: external
url: Python - TensorFlow ImageDataGenerator library

Examples

Example(s)/use of albumentations.
- url: Example use-cases of Albumentations
Example(s)/use of imgaug.
- url: Data Augmentation for Deep Learning :star::page_facing_up::heavy_check_mark: Fantastic Article
- url: Data Augmentation techniques in python
Example(s)/use of tf.keras.preprocessing.image.ImageDataGenerator.
- url: Official Example use-case of tf.keras - ImageDataGenerator
- url: Building powerful image classification models using very little data

Thanks for all these references! Do you know how to integrate imgaug with the TensorFlow input pipeline (tf.data.Dataset)? — S.E.K., Mar 29 '20 at 11:44
This is not exactly an answer to your question, but you might find this useful: https://www.kaggle.com/rsk2327/densenet-imaug. Look for iaa. in the code; it is mostly in the get_seq() function. — CypherX, Mar 30 '20 at 01:26

score 1 · Accepted Answer · answered Mar 29 '20 at 15:19

You can use map to transform the elements of your dataset, and tf.py_function to wrap your cv2 function into a TF op that executes eagerly. In TensorFlow 1.x you may use tf.py_func instead, but its behavior is a bit different: the wrapped function receives NumPy arrays rather than eager tensors. See the tf.py_function documentation for more info.

So, in TF-2.x it will look something like:

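import tensorflow as tf

def cv2_func(image, label):
    # py_function passes eager tensors; convert with .numpy() before calling cv2
    image = image.numpy()
    # your cv2 code goes here
    return image, label

def tf_cv2_func(image, label):
    # the dtypes listed here must match what cv2_func returns
    image, label = tf.py_function(cv2_func, [image, label], [tf.float32, tf.float64])
    return image, label

train_ds = train_ds.shuffle(BUFFER_SIZE).map(tf_cv2_func).batch(BATCH_SIZE)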
