
When you run a SageMaker pipeline from SageMaker Studio, each pipeline step (e.g., ProcessingStep, TrainingStep, LambdaStep) has an Output tab.


I searched the SDK documentation but couldn't find anything related. How can I display metrics from custom processing, training, or Lambda containers in this tab?

PolarStorm

1 Answer


For built-in algorithms

For built-in algorithms, see the official "Define Metrics" guide (section "Using a built-in algorithm for training").

For custom algorithms

The problem can be solved in two steps:

  1. Within your script (e.g., the training script), log the metric you want to capture.

    A simple print or log statement is enough:

    print(f"New best val_loss score: {your_metric}")

  2. In the definition of your pipeline component, set the metric_definitions parameter.

    For example, in Estimators:

metric_definitions (list[dict[str, str] or list[dict[str, PipelineVariable]]) – A list of dictionaries that defines the metric(s) used to evaluate the training jobs. Each dictionary contains two keys: ‘Name’ for the name of the metric, and ‘Regex’ for the regular expression used to extract the metric from the logs. This should be defined only for jobs that don’t use an Amazon algorithm.

For the example above, it is enough to define:

metric_definitions=[
    # Raw string so the backslash in the regex is not treated as a Python escape
    {'Name': 'val_loss', 'Regex': r'New best val_loss score: ([0-9\.]+)'}
]
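Before launching a job, it can help to check locally that the regex actually captures what the script prints. The following is a minimal sketch using only Python's standard `re` module; the helper name `extract_metric` and the sample log lines are hypothetical, and it only approximates how SageMaker applies the pattern to the job logs.

```python
import re

# Same pattern as in metric_definitions; written as a raw string so the
# backslash reaches the regex engine unchanged.
METRIC_REGEX = r"New best val_loss score: ([0-9\.]+)"


def extract_metric(log_line, pattern=METRIC_REGEX):
    """Return the captured value as a float, or None if the line does not match."""
    match = re.search(pattern, log_line)
    if match is None:
        return None
    return float(match.group(1))


# Hypothetical log lines, shaped like the output of the print() in step 1.
print(extract_metric("New best val_loss score: 0.4321"))
print(extract_metric("epoch 3 finished"))
# Scientific notation is only partially captured by this character class:
# the match stops at the "e".
print(extract_metric("New best val_loss score: 1e-05"))
```

Note that Python may print very small floats in scientific notation (e.g. `1e-05`), which this pattern captures only up to the `e`. One way around this, under the same assumptions, is to print the metric with a fixed format such as `{your_metric:.6f}`.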

P.S.: This also applies to classes derived from Estimator, such as SKLearn, PyTorch, etc.


Once this is in place, the step for which you defined the metrics will show a key-value pair with the last captured value in SageMaker Studio. You will also get a graph in CloudWatch metrics that lets you monitor progress, even while training is still running.
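The same last-value pairs can also be read programmatically. This is a sketch, assuming the response of the SageMaker `DescribeTrainingJob` API, which reports final metrics under the `FinalMetricDataList` key; the helper `final_metrics` and the sample response are hypothetical and hand-written for illustration, not real job output.

```python
from typing import Any, Dict


def final_metrics(describe_response: Dict[str, Any]) -> Dict[str, float]:
    """Map each metric name to its last reported value."""
    return {
        entry["MetricName"]: float(entry["Value"])
        for entry in describe_response.get("FinalMetricDataList", [])
    }


# Hypothetical response fragment. With AWS credentials, a real one would come from:
#   boto3.client("sagemaker").describe_training_job(TrainingJobName="<your job name>")
sample_response = {
    "TrainingJobName": "example-job",
    "FinalMetricDataList": [
        {"MetricName": "val_loss", "Value": 0.4321},
    ],
}

print(final_metrics(sample_response))
```

A job without metric definitions, or one whose regex never matched, simply yields an empty dictionary here.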


Giuseppe La Gualano
  • Thanks, that makes sense for `Estimator` and I'll give it a try. What about other steps like `ProcessingStep` or `LambdaStep`? I couldn't find a `metric_definitions` parameter for those APIs. – PolarStorm Jan 10 '23 at 03:49
  • You may find it useful to read this related question: [Metrics for any step of Sagemaker pipeline (not just TrainingStep)](https://stackoverflow.com/questions/74304758/metrics-for-any-step-of-sagemaker-pipeline-not-just-trainingstep) – Giuseppe La Gualano Jan 10 '23 at 14:06