I know of two ways to run a TFX pipeline. First, using a Jupyter notebook with InteractiveContext
in a browser:
from tfx import v1 as tfx
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext
context = InteractiveContext(pipeline_root=_pipeline_data_folder)
example_gen = tfx.components.ImportExampleGen(input_base=_dataset_folder)
context.run(example_gen, enable_cache=True)
statistics_gen = tfx.components.StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen, enable_cache=True)
context.show(statistics_gen.outputs['statistics'])
This way, I can see the statistics artifact in the browser:
The second way to run the same pipeline is with a plain Python script (no browser involved):
from tfx import v1 as tfx
example_gen = tfx.components.ImportExampleGen(input_base=_dataset_folder)
statistics_gen = tfx.components.StatisticsGen(examples=example_gen.outputs['examples'])
components = [
    example_gen,
    statistics_gen,
]
pipeline = tfx.dsl.Pipeline(
    pipeline_name='sample_pipeline',
    pipeline_root=_pipeline_data_folder,
    metadata_connection_config=tfx.orchestration.metadata.sqlite_metadata_connection_config(
        f'{_pipeline_data_folder}/metadata.db'),
    components=components)
tfx.orchestration.LocalDagRunner().run(pipeline)
I understand that since there's no browser involved in the second approach, asking for a visualization during the run is pointless. But the same artifact that was created in the first approach was also created in the second one. So my question is: after the second pipeline has finished, how can I visualize the statistics artifact it created?
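
To make the goal concrete, here is a minimal sketch of one possible direction, not a confirmed solution. It assumes that StatisticsGen writes its output as `<pipeline_root>/StatisticsGen/statistics/<execution_id>/Split-<name>/FeatureStats.pb` (the layout may differ between TFX versions; older releases reportedly used a different file name and format), and that TensorFlow Data Validation is installed. The helper names `find_latest_stats_files` and `write_stats_html` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def find_latest_stats_files(pipeline_root):
    """Map split name -> FeatureStats.pb path for the newest StatisticsGen run.

    ASSUMPTION: artifacts are laid out as
    <pipeline_root>/StatisticsGen/statistics/<execution_id>/Split-<name>/FeatureStats.pb
    """
    stats_root = Path(pipeline_root) / "StatisticsGen" / "statistics"
    # Execution directories are assumed to have integer names; the largest
    # id is treated as the most recent run.
    runs = [d for d in stats_root.iterdir() if d.is_dir() and d.name.isdigit()]
    if not runs:
        raise FileNotFoundError(f"no StatisticsGen executions under {stats_root}")
    latest = max(runs, key=lambda d: int(d.name))
    prefix = "Split-"
    return {
        f.parent.name[len(prefix):]: f
        for f in latest.glob("Split-*/FeatureStats.pb")
    }


def write_stats_html(pipeline_root, output_html, split="train"):
    """Render one split's statistics to a standalone HTML file (hypothetical helper)."""
    # Imported lazily so the path lookup above works without TFDV installed.
    import tensorflow_data_validation as tfdv
    from tensorflow_data_validation.utils import display_util

    stats_path = find_latest_stats_files(pipeline_root)[split]
    # ASSUMPTION: the file is a binary DatasetFeatureStatisticsList proto.
    stats = tfdv.load_stats_binary(str(stats_path))
    html = display_util.get_statistics_html(stats)
    Path(output_html).write_text(html, encoding="utf-8")
```

Under those assumptions, calling `write_stats_html(_pipeline_data_folder, 'stats.html')` after `LocalDagRunner().run(pipeline)` returns would produce a file that can be opened in any browser. Inside a notebook, `tfdv.visualize_statistics(stats)` could be used on the loaded proto instead. Looking up the artifact URI through the ML Metadata store in `metadata.db` would likely be more robust than relying on the directory layout.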