7

What is the value of nested runs in MLflow? I thought a child run would inherit the params of its parent, but I don't see that:

import mlflow

with mlflow.start_run(run_name='myrun'):
    mlflow.log_param('kl', '0p0')
    mlflow.log_param('name', 'ios')
    mlflow.log_metric('mu', 1.0)
    with mlflow.start_run(run_name='myrun2', nested=True):
        mlflow.log_param('name', 'weighted')
        mlflow.log_metric('mu', 2.0)

If I collect the run info in Python

df = mlflow.search_runs()

then we have

df['params.kl']

giving

0    None
1     0p0
Name: params.kl, dtype: object
MrCartoonology

2 Answers

2

As I understand it, nested runs let you group a collection of related model-training runs under a single parent run. This gives the structure: experiment --> run 1, run 2, run 3, ... --> run 1-1, run 1-2, run 2-1, run 2-2, run 3-1, run 3-2, ...

In other words, the parent (outer) mlflow.start_run creates a first-level run entry in the MLflow experiment, and the child (nested) mlflow.start_run creates a second-level run entry beneath it.

1

Nested runs are useful for k-fold cross-validation or hyperparameter tuning, where you validate a model iteratively.

Take for example:

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import mlflow
import mlflow.sklearn

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Initialize a model
model = LogisticRegression()

# Initialize cross validation (shuffle because the iris rows are ordered by class)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

with mlflow.start_run(run_name="Iris Logistic Regression") as parent_run:
    # Log the parameter shared by all folds on the parent run
    mlflow.log_param("Model", "Logistic Regression")

    for fold, (train_index, test_index) in enumerate(kf.split(X)):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]

        model.fit(X_train, y_train)
        predictions = model.predict(X_test)

        # Start nested MLflow run
        with mlflow.start_run(run_name=f"Fold {fold}", nested=True):
            # Log metrics
            mlflow.log_metric("Accuracy", metrics.accuracy_score(y_test, predictions))
            mlflow.log_metric("Precision", metrics.precision_score(y_test, predictions, average='micro'))
            mlflow.log_metric("Recall", metrics.recall_score(y_test, predictions, average='micro'))

            # Save the model to the current nested run's artifact directory
            mlflow.sklearn.log_model(model, "model")

In this example, we want to track the model metrics for each fold, while keeping every fold-level run attached to the same parent Logistic Regression run. That way, all the related fold results stay grouped under one model.

A hyperparameter optimization (HPO) example would look similar, with the tracked hyperparameters logged in each child run and, if you're doing something fancier, model checkpoints logged as artifacts.

Dave Liu