Nested runs using MLflowClient

Question

In MLflow, you can create nested runs with the fluent API, and they are shown as collapsible entries in the UI. For example, with the following code (see this for UI support):

import mlflow

with mlflow.start_run():  # parent run
  mlflow.log_param("mse", 0.10)
  mlflow.log_param("lr", 0.05)
  mlflow.log_param("batch_size", 512)
  with mlflow.start_run(nested=True):  # child run, nested under the parent
    mlflow.log_param("max_runs", 32)
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("acc", 98)
    mlflow.log_metric("rmse", 98)
  # no explicit mlflow.end_run() needed: each with block ends its own run

Due to database connection issues, I want to use a single mlflow client across my application.

How can I nest runs, e.g. for hyperparameter optimization, when the runs are created via MlflowClient().create_run()?

Here is a full example that uses MLFlow and Optuna with parallel MLFlow runs: https://simonhessner.de/mlflow-optuna-parallel-hyper-parameter-optimization-and-logging/ — Simon Hessner, Apr 19 '21 at 11:46

score 6 · Answer 1 · answered Apr 19 '21 at 10:49

It is a bit complicated to achieve, but I found a way by looking into the Fluent Tracking Interface that is used when you directly use the mlflow import.

In the start_run function you can see that a nested run is defined simply by a specific tag, mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID (the tag key mlflow.parentRunId). Set it to the run.info.run_id value of your parent run and the child will be shown correctly in the UI.

Here is an example:

from mlflow.tracking import MlflowClient
from mlflow.utils.mlflow_tags import MLFLOW_PARENT_RUN_ID

client = MlflowClient()

# Reuse the experiment if it already exists, otherwise create it.
experiment = client.get_experiment_by_name("test_nested")
if experiment is None:
    experiment_id = client.create_experiment("test_nested")
else:
    experiment_id = experiment.experiment_id

parent_run = client.create_run(experiment_id=experiment_id)
client.log_param(parent_run.info.run_id, "who", "parent")

# A child run is an ordinary run tagged with its parent's run ID.
child_run_1 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_1.info.run_id, "who", "child 1")

child_run_2 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_2.info.run_id, "who", "child 2")

# Runs created through the client stay in RUNNING state until terminated explicitly.
for run in (child_run_1, child_run_2, parent_run):
    client.set_terminated(run.info.run_id)

In case you're wondering: The run name can also be specified that way, using the mlflow.utils.mlflow_tags.MLFLOW_RUN_NAME tag.
