
We have a project that essentially follows this docker example, the only difference being that we created a custom model, similar to this, whose code lies in a directory called forecast. We succeeded in running the model with mlflow run. The problem arises when we try to serve the model. After running

mlflow models build-docker -m "runs:/my-run-id/my-model" -n "my-image-name"

we fail to run the container with

docker run -p 5001:8080 "my-image-name"

with the following error:

ModuleNotFoundError: No module named 'forecast'

It seems that the Docker image is not aware of the source code defining our custom model class. With a Conda environment the problem does not arise, thanks to the code_path argument in mlflow.pyfunc.log_model.

Our Dockerfile is very basic: just FROM continuumio/miniconda3:4.7.12 and RUN pip install {model_dependencies}.

How can we make the Docker image aware of the source code it needs to deserialise and run the model?

bruco

1 Answer


You can specify source code dependencies by setting the code_path argument when logging the model. The argument takes a list of paths. So in your case, you can do something like:

mlflow.pyfunc.log_model(..., code_path=[<path to your forecast.py file>])
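To make the call above more concrete, here is a minimal sketch of how the logging step might look for the layout described in the question. It assumes the forecast package sits directly under the project root and that the argument is named code_path, as in the MLflow 1.4 release mentioned in the comments (newer releases name it code_paths). The helper names forecast_code_path and log_forecast_model, and the project_root parameter, are hypothetical and only serve the illustration.

```python
import os


def forecast_code_path(project_root="."):
    """Return the value to pass as code_path: a list holding the package directory.

    Assumption: the custom model code lives in <project_root>/forecast.
    """
    return [os.path.join(os.path.abspath(project_root), "forecast")]


def log_forecast_model(python_model, artifact_path="my-model", project_root="."):
    """Log a custom pyfunc model together with the forecast source directory.

    python_model is expected to be an instance of a class deriving from
    mlflow.pyfunc.PythonModel. Call this inside an active MLflow run.
    """
    # Imported here so the path helper above can be used without MLflow installed.
    import mlflow.pyfunc

    mlflow.pyfunc.log_model(
        artifact_path=artifact_path,
        python_model=python_model,
        # A list, not a bare string; the whole directory is shipped with the model.
        code_path=forecast_code_path(project_root),
    )
```

Passing the whole directory rather than a single file keeps every module of the package importable when the model is deserialised inside the container.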
Tomas
  • I tried to add the path to my model.py but that's still giving the same error. – bruco Jan 30 '20 at 16:23
  • I just discovered that it does work if I put the entire directory in code_path: `code_path=os.path.join('forecast')`. However the predictions fail with a strange error starting with `{"error_code": "MALFORMED_REQUEST", "message": "Failed to parse input as a Pandas DataFrame. Ensure that the input is a valid JSON-formatted Pandas DataFrame with the split orient produced using the pandas.DataFrame.to_json(..., orient='split') method."`, but the json is actually in split orientation... – bruco Jan 30 '20 at 16:58
  • I forgot to mention, the same json used with `mlflow models serve` (Conda environment) works without problems – bruco Jan 30 '20 at 17:04
  • Hmm, the code_path argument should be a list, and it should work with both files and directories. Can you share what error you get if you set it as [ – Tomas Jan 31 '20 at 19:01
  • Can you share your mlflow version, conda env file for the model and part of your data that repros the problem? (the values and column names can be replaced) – Tomas Jan 31 '20 at 19:07
  • MLflow 1.4, conda.yaml of the model is `channels: - defaults - conda-forge dependencies: - python=3.7.4 - mlflow=1.4.0 - numpy=1.17.4 - scikit-learn=0.21.3 - cloudpickle=1.2.2` – bruco Feb 07 '20 at 17:30
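
For reference when reproducing the MALFORMED_REQUEST error discussed in the comments, the sketch below builds a split-oriented JSON body by hand and prepares a request against the container started in the question (port 5001). It does not diagnose the error. The column names and values are made up, the helper names are hypothetical, and the /invocations path and the `application/json; format=pandas-split` content type are what MLflow 1.x scoring servers accept to the best of my knowledge, so check them against your version.

```python
import json
import urllib.request


def split_payload(columns, rows):
    """Serialise rows in the layout produced by DataFrame.to_json(orient='split')."""
    return json.dumps(
        {
            "columns": list(columns),
            "index": list(range(len(rows))),
            "data": [list(row) for row in rows],
        }
    )


def build_request(body, url="http://localhost:5001/invocations"):
    """Prepare (but do not send) a POST request carrying the split-oriented body."""
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json; format=pandas-split"},
        method="POST",
    )


# Hypothetical feature names and values, for illustration only.
body = split_payload(["feature_a", "feature_b"], [[1.0, 2.0], [3.0, 4.0]])
request = build_request(body)
# To send it once the container is running:
# urllib.request.urlopen(request)
```

Comparing this body byte for byte with the one that fails can help narrow down whether the payload or the serving environment is at fault.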