
I've stored my model on GitHub with an MLproject file so that others can run it with various parameters. Now I would like to log the models created by those runs as artifacts so that users can load them via the MLmodel file. As my model is a custom pyfunc model, this is not as simple as with built-in flavors such as spark. I first saved and loaded the model locally to make sure my environment, model and artifact code work. Now I want to include the logging of the model as part of the MLproject run from GitHub. As there is no example of this in the documentation that I know of, I wanted to ask for help and to suggest that this may be a good addition to the documentation or examples.

In terms of code, I wrote the following at the end of the MLflow run:

Build the model depending on the parameters specified in the project run:

ETS_Exogen = ETS_Exogen(params=res.x, before=before, after=after)

Log the model with the previously defined model, environment and artifacts:

mlflow.pyfunc.log_model(python_model=ETS_Exogen, conda_env=conda_env, artifacts=artifacts)

Does mlflow.pyfunc.log_model automatically log the model into the run's artifacts, or do I need to define an artifact_path? Should I rather use mlflow.pyfunc.save_model? I defined the artifact paths so that they are resolved from the GitHub repository as follows:

artifacts = {"exogen_variables": os.path.join(os.path.dirname(os.path.abspath(__file__)), "exogen_variables.csv")}

Is this correct? Link to the documentation on custom models: https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#pyfunc-create-custom-workflows
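
For completeness, here is a minimal sketch of how I wire the artifact into the model locally via context.artifacts; the ETS_Exogen internals are simplified to placeholders and only the artifact handling is shown:

import pandas as pd
import mlflow.pyfunc

class ETS_Exogen(mlflow.pyfunc.PythonModel):
    # Simplified stand-in for the custom ETS model with exogenous variables.
    def __init__(self, params, before, after):
        self.params = params
        self.before = before
        self.after = after

    def load_context(self, context):
        # context.artifacts maps the keys of the artifacts dict passed to
        # log_model to local paths of the copied files.
        self.exogen = pd.read_csv(context.artifacts["exogen_variables"],
                                  index_col="date")

    def predict(self, context, model_input):
        # Placeholder forecast logic; the real model uses self.params,
        # self.before, self.after and self.exogen here.
        return model_input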


2 Answers


Logging a model needs a path; the standard is to store it in the run's artifacts under a folder called model. The command is as follows:

mlflow.pyfunc.log_model(artifact_path="model", python_model=ETS_Exogen, conda_env=conda_env)

Here is how to add data to the model from an HTTP server instead. Don't use the artifacts argument; rather, load the data directly with pandas in load_context:

def load_context(self, context):
    import pandas as pd  # data wrangling
    # Download the exogenous variables straight from the GitHub repository
    # instead of packaging them as an MLflow artifact.
    url_to_exogen_raw = 'https://raw.githubusercontent.com/MatthiasHerp/ETS_Ex_BA_MLFlow/master/exogen_variables.csv'
    self.exogen = pd.read_csv(url_to_exogen_raw, index_col='date')
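
Once the model is logged with artifact_path="model", users can load it back from the run and call predict; a minimal sketch (the run ID and input DataFrame are placeholders):

import pandas as pd
import mlflow.pyfunc

# Placeholder run ID; take it from the MLflow UI or from mlflow.active_run().
run_id = "<run_id>"

# Load the pyfunc model that was logged under artifact_path="model".
loaded_model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")

# Placeholder input; the real model expects whatever its predict() was written for.
predictions = loaded_model.predict(pd.DataFrame({"date": []}))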
    can you extend your answer and explain what you mean by _"Dont use artifact but rather load it directly with Pandas in the context."_? I am not entirely certain that it is clear how to produce these artifacts and where these artifacts are loaded from? Do the artifacts, e.g. pretrained weights need to reside on some external storage beforehand? – tafaust Oct 27 '21 at 08:00

Use mlflow.log_artifacts to log a custom model's files:

with mlflow.start_run(run_name='name'):
    mlflow.log_artifacts("path/to/models_created", artifact_path="model")

details "mlflow.log_artifact() logs a local file or directory as an artifact, optionally taking an artifact_path to place it in within the run’s artifact URI. Run artifacts can be organized into directories, so you can place the artifact in a directory this way."

If we have a trained PyTorch model, we can use mlflow.pytorch.save_model to save it to a local location. Prior to this we need a model of type torch.nn.Module and the conda_env details. This creates the MLmodel file and the other required files. mlflow.pytorch.log_model does the same but logs the model directly into the active run's artifacts. The logged files can later be registered as a model and used for serving.
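
A minimal sketch of that workflow, assuming a toy torch.nn.Module (the model architecture and paths are placeholders):

import torch
import mlflow.pytorch

# Toy model of type torch.nn.Module; any trained model is handled the same way.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 1)

    def forward(self, x):
        return self.linear(x)

model = TinyNet()

# Save locally: writes the MLmodel file, the weights and the environment files.
mlflow.pytorch.save_model(model, path="saved_tiny_net")

# Or log straight into a run under artifact_path="model".
with mlflow.start_run():
    mlflow.pytorch.log_model(model, artifact_path="model")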
