How to parse an in-house TFRecords dataset when loading it using ImportExampleGen

Question

Following the official tutorial, this is how I should load a TFRecord dataset:

raw_image_dataset = tf.data.TFRecordDataset('images.tfrecords')

# Create a dictionary describing the features.
image_feature_description = {
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'depth': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
}

def _parse_image_function(example_proto):
  # Parse the input tf.train.Example proto using the dictionary above.
  return tf.io.parse_single_example(example_proto, image_feature_description)

parsed_image_dataset = raw_image_dataset.map(_parse_image_function)
parsed_image_dataset

The _parse_image_function is where I get the chance to set the type and shape of my loaded tensors.

But then, when I'm loading the same file using ImportExampleGen, I don't see how I can inject my parse function into the mix!

context = InteractiveContext()
example_gen = tfx.components.ImportExampleGen(input_base=_dataset_folder)
context.run(example_gen, enable_cache=True)

Does anyone know what's going to happen to my parse logic when I'm using the ImportExampleGen class instead of loading my dataset directly using TFRecordDataset class?

How to parse an in-house TFRecords dataset when loading it using ImportExampleGen

0 Answers0