
I am currently tracking my MLflow runs to a local file path URI. I would also like to set up a remote tracking server to share with my collaborators. One thing I would like to avoid is logging everything to the server, as it might soon be flooded with failed runs.

Ideally, I'd like to keep my local tracker, and then be able to send only the promising runs to the server.

What is the recommended way of copying a run from a local tracker to a remote server?
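For reference, my current setup looks roughly like this (the path and the logged values are just placeholders):

    import mlflow

    # Current local, file-based tracker (placeholder path).
    mlflow.set_tracking_uri("file:///path/to/mlruns")

    with mlflow.start_run():
        mlflow.log_param("lr", 0.01)     # illustrative only
        mlflow.log_metric("loss", 0.42)  # illustrative only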

Daniel P
  • In my experience, trying to filter which runs are worth logging by maintaining additional trackers may be a waste of time. Track as you go, then clean up runs that are no longer needed. If you really want migration, there are extra things to consider, such as differences in backends: is your local tracker built on files while the centralized one is built on a database and cloud storage? Also, how often would you migrate? Migration should always be done carefully. – Maciej Skorski May 19 '23 at 04:25

2 Answers


I have been interested in a related capability of copying runs from one experiment to another, for a similar reason: keeping one area for arbitrary runs and another into which the promising runs we move forward with are copied. Your scenario with a separate tracking server is just a generalization of mine. Either way, there is apparently no built-in feature for this in MLflow currently. However, the Python-based mlflow-export-import tool looks like it may cover both our use cases: it documents usage on both Databricks and open-source MLflow, and it appears current as of this writing. I have not tried the tool myself yet; if/when I do, I'm happy to post a follow-up here saying whether it worked well for this purpose, and anyone else could do the same. Thanks and cheers!
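In the meantime, if you would rather avoid an extra dependency, a manual copy is possible with the standard MlflowClient API. Below is a minimal sketch, assuming placeholder URIs and a hypothetical copy_run helper of my own; note that it copies only each metric's latest value, not the full metric history:

    from mlflow.tracking import MlflowClient
    from mlflow.entities import Metric, Param, RunTag

    # Placeholder URIs -- substitute your own local path and remote host.
    local = MlflowClient(tracking_uri="file:///path/to/mlruns")
    remote = MlflowClient(tracking_uri="https://your-remote-mlflow-host")

    def copy_run(run_id, remote_experiment_id):
        """Copy one local run's params, metrics, tags, and artifacts to the remote server."""
        src = local.get_run(run_id)
        dst = remote.create_run(experiment_id=remote_experiment_id)
        # One batch call for params, tags, and the latest value of each metric.
        remote.log_batch(
            dst.info.run_id,
            metrics=[Metric(k, v, 0, 0) for k, v in src.data.metrics.items()],
            params=[Param(k, v) for k, v in src.data.params.items()],
            tags=[RunTag(k, v) for k, v in src.data.tags.items()],
        )
        # Artifacts: download from the local store, then re-upload.
        artifact_dir = local.download_artifacts(run_id, "")
        remote.log_artifacts(dst.info.run_id, artifact_dir)
        remote.set_terminated(dst.info.run_id)
        return dst.info.run_id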

AndyG
  • Did this ever work for you? My training crashes because of intermittent connectivity issues to Databricks (so the log functions error out). I would like to run a server locally and back it up to Databricks every night. – illan Feb 24 '23 at 18:10

To publish your trained model to a remote MLflow server, you should use the register_model API. For example, if you are using the spaCy flavor of MLflow, you can do the following, where nlp is the trained model:

    import mlflow
    import mlflow.spacy

    # Log the trained spaCy model, then register it by its runs:/ URI.
    mlflow.spacy.log_model(spacy_model=nlp, artifact_path='mlflow_sample')
    model_uri = "runs:/{run_id}/{artifact_path}".format(
        run_id=mlflow.active_run().info.run_id, artifact_path='mlflow_sample'
    )
    mlflow.register_model(model_uri=model_uri, name='mlflow_sample')

Make sure that the following environment variables are set. In the example below, S3 storage is used:

    SET MLFLOW_TRACKING_URI=https://YOUR-REMOTE-MLFLOW-HOST
    SET MLFLOW_S3_BUCKET=s3://YOUR-BUCKET-NAME
    SET AWS_ACCESS_KEY_ID=YOUR-ACCESS-KEY
    SET AWS_SECRET_ACCESS_KEY=YOUR-SECRET-KEY
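If you prefer configuring this in code rather than through environment variables, the tracking URI can also be set with mlflow.set_tracking_uri before logging or registering (the host is a placeholder; the AWS credentials still come from the environment):

    import mlflow

    # Equivalent to SET MLFLOW_TRACKING_URI=... above.
    mlflow.set_tracking_uri("https://YOUR-REMOTE-MLFLOW-HOST")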
Boris Modylevsky
  • My question is how one "publishes" the model to the centralized MLflow instance from the local MLflow registry, without running training or validation again. – Daniel P May 05 '21 at 15:49