I am trying to integrate KerasTuner and MLflow. I'd like to record the loss at each epoch of each trial of KerasTuner.

My approach is:


import mlflow
import tensorflow as tf


class MlflowCallback(tf.keras.callbacks.Callback):

    # This method is called by Keras after each epoch.
    def on_epoch_end(self, epoch, logs=None):
        if not logs:
            return
        # Log the metrics from Keras to MLflow
        mlflow.log_metric("loss", logs["loss"], step=epoch)

from kerastuner.tuners import RandomSearch

with mlflow.start_run(run_name="myrun", nested=True) as run:

    tuner = RandomSearch(
        train_fn,
        objective='loss',
        max_trials=25,
    )
    tuner.search(train,
                 validation_data=validation,
                 validation_steps=validation_steps,
                 steps_per_epoch=steps_per_epoch,
                 epochs=5,
                 callbacks=[MlflowCallback()])

However, the loss values are reported (sequentially) in one single experiment. Is there a way to record them independently, one per trial?


2 Answers


The line

`with mlflow.start_run(run_name="myrun", nested=True) as run:`

is what causes every training to be stored in the same experiment. Remove it, and MLflow will automatically create a separate experiment for each training performed by `tuner.search`.
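
A minimal sketch of what this answer suggests, reusing `train_fn`, `train`, `validation`, and the `MlflowCallback` from the question (all assumed to be defined as above):

from kerastuner.tuners import RandomSearch

# No surrounding mlflow.start_run() block; the MlflowCallback logs
# the per-epoch loss on its own.
tuner = RandomSearch(
    train_fn,
    objective='loss',
    max_trials=25,
)
tuner.search(train,
             validation_data=validation,
             validation_steps=validation_steps,
             steps_per_epoch=steps_per_epoch,
             epochs=5,
             callbacks=[MlflowCallback()])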


The answer is pretty simple: instead of using a Callback, you subclass `HyperModel` from KerasTuner like this:

import keras_tuner
import mlflow
import mlflow.tensorflow


# Create a HyperModel subclass
class SGNNHyperModel(keras_tuner.HyperModel):

    def build(self, hp):
        # Create your model here, using `hp` to set its hyperparameters
        model = SomeModel()

        return model

    def fit(self, hp, model, *args, **kwargs):
        # Start one MLflow run per trial, log its hyperparameters,
        # and let autologging capture the per-epoch metrics
        with mlflow.start_run():
            mlflow.log_params(hp.values)
            mlflow.tensorflow.autolog()
            return model.fit(*args, **kwargs)

and use this class as follows:

tuner = keras_tuner.BayesianOptimization(
    SGNNHyperModel(),
    max_trials=20,
    # Do not resume a previous search in the same directory.
    overwrite=True,
    objective="val_loss",
    # Set a directory to store the intermediate results.
    directory="/tmp/tb",
)
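
A short usage sketch, assuming hypothetical training arrays `x_train`/`y_train` and validation arrays `x_val`/`y_val` that are not part of the original answer:

# Run the search; the HyperModel's fit() opens one MLflow run per trial
tuner.search(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=10,
)

# Retrieve the best model found during the search
best_model = tuner.get_best_models(num_models=1)[0]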

Reference:
https://medium.com/@m.nusret.ozates/using-mlflow-with-keras-tuner-f6df5dd634bc