0

a superimposed display for train/val splits using StatisticsGen

Hi,

I'm currently using tfx pipeline inside kubeflow. I struggle to have StatisticsGen showing a single graph with train and validation splits curves superimposed, allowing better comparaison distributions. this is exactly how tfdv.visualize_statistics(lhs_statistics=train_stats, rhs_statistics=eval_stats, lhs_name='train', rhs_name='eval') behaves (see illustration 1), and I would like StatisticsGen to also provide a superimposed splits graph.

Thanks for any reference or help so that i can move forward.

Regards

1 Answers1

0

You can use something like

# docs-infra: no-execute
# Compare evaluation data with training data
tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats,
                          lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET')

From the tensorflow data validation tutorial

Pritam Dodeja
  • 177
  • 1
  • 8