I am testing the metrics display in Vertex AI (the Kubeflow Pipelines UI) for a pipeline running on GCP. I use the reusable-component method, with the component specified in YAML, not function-based components. The component is containerized, pushed to Artifact Registry, and referenced through the YAML specification. I verified that the file '/mlpipeline-metrics.json' is correctly created in the container, but the KFP UI doesn't show the metric (accuracy in this case). I am also able to export the metrics to the outputPath, yet they are not displayed in the UI from that local JSON either.
I've ensured both that the metrics artifact is correctly named "mlpipeline-metrics" and that the file is saved in the root of the container as "mlpipeline-metrics.json". Still, the Kubeflow pipeline doesn't display the metrics in the Run view.
This is the code:
import argparse
import json

# generate_schema and train_evaluate are defined elsewhere in this module (omitted here)


def produce_metrics(mlpipeline_metrics):
    accuracy = 0.9
    metrics = {
        'metrics': [{
            'name': 'accuracy-score',
            'numberValue': accuracy,
            'format': 'PERCENTAGE',
        }]
    }
    # save to the mlpipeline-metrics.json file in the root of the container
    with open('/mlpipeline-metrics.json', 'w') as f:
        json.dump(metrics, f)
    # save to the artifact path passed in by the pipeline
    with open(mlpipeline_metrics + '.json', 'w') as f:
        json.dump(metrics, f)


def main_fn(arguments):
    training_table_bq = arguments.training_table
    validation_table_bq = arguments.validation_table
    schema_dict = generate_schema(training_table_bq)
    target_name = arguments.target_name
    gcs_model_path = arguments.gcs_model_path
    mlpipeline_metrics = arguments.mlpipeline_metrics
    # run train / evaluate
    gcs_model_path = train_evaluate(training_table_bq, validation_table_bq, schema_dict,
                                    target_name, gcs_model_path, mlpipeline_metrics)
    return gcs_model_path


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="train evaluate model")
    parser.add_argument("--training_table", type=str, help='Name of the input training table')
    parser.add_argument("--validation_table", type=str, help='Name of the input validation table')
    parser.add_argument("--target_name", type=str, help='Name of the target variable')
    parser.add_argument("--gcs_model_path", type=str, help='Output directory where the model is saved')
    parser.add_argument('--mlpipeline_metrics',
                        type=str,
                        required=False,
                        default='/mlpipeline-metrics.json',
                        help='Output path for the file containing the metrics JSON structure.')
    args = parser.parse_args()
    main_fn(args)
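For completeness: the path that KFP substitutes for {outputPath: MLPipeline_Metrics} (something like /tmp/outputs/MLPipeline_Metrics/data, as far as I understand the v1 behaviour) does not exist in the container up front, so the write has to create the parent directories first. A minimal sketch of what I mean (write_metrics_artifact is just an illustrative name, not part of the component above):

import json
from pathlib import Path

def write_metrics_artifact(metrics, output_path):
    # The KFP-generated output path (e.g. /tmp/outputs/MLPipeline_Metrics/data)
    # has no parent directory yet, so create it before writing.
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)
    with open(output_path, 'w') as f:
        json.dump(metrics, f)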
training.yaml
============
name: training
description: Scikit-learn trainer. Receives the names of the training and validation BQ tables; trains, evaluates, and saves a model.
inputs:
- {name: training_table, type: String, description: 'name of the BQ training table'}
- {name: validation_table, type: String, description: 'name of the BQ validation table'}
- {name: target_name, type: String, description: 'Name of the target variable'}
- {name: max_depth, type: Integer, description: 'max depth'}
- {name: learning_rate, type: Float, description: 'learning rate'}
- {name: n_estimators, type: Integer, description: 'n estimators'}
outputs:
- {name: gcs_model_path, type: OutputPath, description: 'output directory where the model is saved'}
- {name: MLPipeline_Metrics, type: Metrics, description: 'output directory where the metrics are saved'}
implementation:
  container:
    image: ..... train_comp:latest
    command: [
      /src/component/train.py,
      --training_table, {inputValue: training_table},
      --validation_table, {inputValue: validation_table},
      --target_name, {inputValue: target_name},
      --gcs_model_path, {outputPath: gcs_model_path},
      --mlpipeline_metrics_path, {outputPath: MLPipeline_Metrics},
      --max_depth, {inputValue: max_depth},
      --learning_rate, {inputValue: learning_rate},
      --n_estimators, {inputValue: n_estimators}
    ]
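And this is roughly how the component is wired into the pipeline (simplified; the hyperparameter values below are placeholders, not my real ones):

from kfp import components, dsl

# Load the reusable component from the YAML specification above
train_op = components.load_component_from_file('training.yaml')

@dsl.pipeline(name='training-pipeline')
def training_pipeline(training_table: str, validation_table: str, target_name: str):
    # gcs_model_path and MLPipeline_Metrics are outputs, so only the inputs are passed
    train_op(
        training_table=training_table,
        validation_table=validation_table,
        target_name=target_name,
        max_depth=6,
        learning_rate=0.1,
        n_estimators=100,
    )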