8

In mlflow, you can run nested runs using the fluent projects API which are collapsable in the UI. E.g. by using the following code (see this for UI support):

with mlflow.start_run(nested=True):
  mlflow.log_param("mse", 0.10)
  mlflow.log_param("lr", 0.05)
  mlflow.log_param("batch_size", 512)
  with mlflow.start_run(nested=True):
    mlflow.log_param("max_runs", 32)
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("acc", 98)
    mlflow.log_metric("rmse", 98)
  mlflow.end_run()

Due to database connection issues, I want to use a single mlflow client across my application.

How can I stack runs, e.g. for hyperparameter optimization, using created runs via MlflowClient().create_run()?

Julian L.
  • 401
  • 4
  • 12
  • 1
    Here is a full example that uses MLFlow and Optuna with parallel MLFlow runs: https://simonhessner.de/mlflow-optuna-parallel-hyper-parameter-optimization-and-logging/ – Simon Hessner Apr 19 '21 at 11:46

1 Answers1

6

It is a bit complicated to achieve, but I found a way by looking into the Fluent Tracking Interface that is used when you directly use the mlflow import.

In the start_run function you can see that a nested_run is just defined by setting a specific tag mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID. Just set this to the run.info.run_id value of your parent run and it will be shown correctly in the UI.

Here is an example:

from mlflow.tracking import MlflowClient
from mlflow.utils.mlflow_tags import MLFLOW_PARENT_RUN_ID

client = MlflowClient()
try:
    experiment = client.create_experiment("test_nested")
except:
    experiment = client.get_experiment_by_name("test_nested").experiment_id
parent_run = client.create_run(experiment_id=experiment)
client.log_param(parent_run.info.run_id, "who", "parent")

child_run_1 = client.create_run(
        experiment_id=experiment,
        tags={
            MLFLOW_PARENT_RUN_ID: parent_run.info.run_id
        }
    )
client.log_param(child_run_1.info.run_id, "who", "child 1")

child_run_2 = client.create_run(
        experiment_id=experiment,
        tags={
            MLFLOW_PARENT_RUN_ID: parent_run.info.run_id
        }
    )
client.log_param(child_run_2.info.run_id, "who", "child 2")

In case you're wondering: The run name can also be specified that way, using the mlflow.utils.mlflow_tags.MLFLOW_RUN_NAME tag.

Simon Hessner
  • 1,757
  • 1
  • 22
  • 49