
I am trying to use TensorFlow Extended (TFX) to build a pipeline for my image classification model. I read and transform images from a local directory with the following code:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and apply random augmentations.
train_datagen = ImageDataGenerator(rescale=1.0 / 255.0,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

# Labels are inferred from the subfolder names under train_data_path.
train_generator = train_datagen.flow_from_directory(
    directory=train_data_path,
    batch_size=32,
    class_mode='categorical',
    target_size=(150, 150))

# The validation generator uses the same settings as the training one.
validation_datagen = ImageDataGenerator(rescale=1.0 / 255.0,
                                        rotation_range=40,
                                        width_shift_range=0.2,
                                        height_shift_range=0.2,
                                        shear_range=0.2,
                                        zoom_range=0.2,
                                        horizontal_flip=True,
                                        fill_mode='nearest')

validation_generator = validation_datagen.flow_from_directory(
    directory=test_data_path,
    batch_size=32,
    class_mode='categorical',
    target_size=(150, 150))

The data directory looks like this:

.
└── Data
    ├── test
    ├── train
    │   ├── buildings
    │   ├── forest
    │   ├── glacier
    │   ├── mountain
    │   ├── sea
    │   └── street
    └── validation
        ├── buildings
        ├── forest
        ├── glacier
        ├── mountain
        ├── sea
        └── street
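
As far as I understand, flow_from_directory derives each label from the name of the subfolder, with class names sorted alphanumerically, and class_mode='categorical' turns the resulting index into a one-hot vector. Below is a minimal pure-Python sketch of that mapping, to make clear what information any TFX ingestion step would need to reproduce. The helper names index_image_folder and one_hot and the list of file extensions are my own assumptions, not part of Keras or TFX.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to match the actual data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def index_image_folder(root):
    """Scan a class-per-subfolder directory (hypothetical helper).

    Returns a (class_indices, samples) pair, where class_indices maps each
    subfolder name to an integer index and samples is a list of
    (file_path, class_index) tuples.
    """
    root = Path(root)
    # Sort the class names so the name-to-index mapping is deterministic.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_indices = {name: i for i, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for path in sorted((root / name).rglob("*")):
            # Skip anything that is not a regular image file.
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(path), class_indices[name]))
    return class_indices, samples


def one_hot(index, num_classes):
    """Encode an integer class index as a one-hot list of floats."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]
```

For the layout above, calling index_image_folder("Data/train") would map the six subfolders to indices 0 through 5 in alphabetical order, starting with buildings.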

However, almost all TensorFlow Extended tutorials and documentation only show how to read and transform data from a CSV file using CsvExampleGen, as follows:

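import os
import tempfile
import urllib.request

from tfx import v1 as tfx
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

# Download the example CSV into a temporary directory.
_data_root = tempfile.mkdtemp(prefix='tfx-data')
DATA_PATH = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/chicago_taxi_pipeline/data/simple/data.csv'
_data_filepath = os.path.join(_data_root, "data.csv")
urllib.request.urlretrieve(DATA_PATH, _data_filepath)

context = InteractiveContext()

# Ingest every CSV file found under _data_root.
example_gen = tfx.components.CsvExampleGen(input_base=_data_root)
context.run(example_gen, enable_cache=True)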

I could not find a proper way to build a pipeline that reads and transforms an image dataset from a folder. Does anyone have a solution, tutorial, or documentation for this?

Protik Nag