I'm using the Python SDK, and my goal is to download the output files from a pipeline step run. However, so far I have only been able to access the (global) pipeline logs, not the logs of the individual steps. Here is my code at the moment:

train_exp = ws.experiments.get('scheduled-train-pipeline')
# Collect all runs (newest first), then list the files of the latest one
runs = list(train_exp.get_runs())
runs[0].get_file_names()

I need to access the child step of the pipeline run and then download the logs of that step.

Jorge LG

1 Answer

I think you need these classes and methods to achieve the desired result:

experiment.get_runs()

run.download_files()

run.get_file_names()

run.get_children()

Make sure you have a valid workspace object ws. Note that this snippet is an untested sketch: it should give you an idea of the approach, and you will need to adjust it for your needs using the documentation for the methods listed above:

from azureml.core import Experiment, Workspace

# Assumption: a valid workspace object; from_config() reads a local config.json
ws = Workspace.from_config()
experiment_name = "scheduled-train-pipeline"  # experiment name taken from the question

# Get the list of runs of the experiment:
experiment = Experiment(ws, experiment_name)
run_ids_list = []
for run in experiment.get_runs():
    run_ids_list.append(run.id)
    # you should probably limit this loop to the number of runs you want to retrieve

# Then loop over run_ids_list:
for run_id in run_ids_list:
    pipeline_run = ws.get_run(run_id)
    for child_run in pipeline_run.get_children():
        files = child_run.get_file_names()  # list of files for further processing
        child_run.download_files(
            prefix="outputs/",
            # placeholder: replace with the directory you want to save to
            output_directory="<where you want to save it>",
        )
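
Two things the loop above leaves open are limiting the number of runs and keeping the files of different steps apart. Here is a minimal sketch of both, also untested against a real workspace. The helper names latest_runs and download_step_files and the output_root parameter are my own, not part of the SDK. The code only relies on get_runs(), get_children(), get_file_names(), download_files() and run.id, and it assumes get_runs() yields the newest run first. For the step logs, look at what get_file_names() returns for a child run and pass the matching prefix instead of "outputs/".

```python
from itertools import islice
from pathlib import Path


def latest_runs(experiment, limit=1):
    """Return at most `limit` runs, assuming get_runs() yields newest first."""
    return list(islice(experiment.get_runs(), limit))


def download_step_files(pipeline_run, output_root, prefix="outputs/"):
    """Download files under `prefix` from each child (step) run.

    Each step gets its own subdirectory named after its run id, so files
    with the same name in different steps do not overwrite each other.
    Returns a dict mapping step run id -> list of matching file names.
    """
    downloaded = {}
    for child_run in pipeline_run.get_children():
        names = [n for n in child_run.get_file_names() if n.startswith(prefix)]
        if not names:
            continue  # this step has nothing under the prefix
        target = Path(output_root) / child_run.id
        target.mkdir(parents=True, exist_ok=True)
        child_run.download_files(prefix=prefix, output_directory=str(target))
        downloaded[child_run.id] = names
    return downloaded
```

With the objects from the snippet above, usage would look something like download_step_files(latest_runs(experiment)[0], "downloads"), where "downloads" is just an example directory name.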

Good luck!

Sysanin