I want to log a model that has a custom predict method. Here is an example of the signature:
from sklearn.ensemble import RandomForestRegressor

class CustomForest(RandomForestRegressor):
    def predict(self, X, second_arg=False):
        pred = super().predict(X)
        value = 1 if second_arg else 0
        return pred, value
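To show what the custom signature does in practice, here is a minimal sketch with made-up data (X_demo, y_demo and n_estimators are arbitrary placeholders):

import numpy as np

# made-up data, only to illustrate the custom predict signature
X_demo = np.random.rand(20, 3)
y_demo = np.random.rand(20)

forest = CustomForest(n_estimators=10)
forest.fit(X_demo, y_demo)

# returns a tuple (array of predictions, 1) because second_arg=True
pred, value = forest.predict(X_demo, second_arg=True)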
This model is saved in the file model.py. From there I got the idea to create a wrapper to get access to the underlying model and its custom predict:
import mlflow.pyfunc


class WrapperPythonModel(mlflow.pyfunc.PythonModel):
    """
    Class to train and use the custom model.
    """

    def load_context(self, context):
        """Called when the MLflow model is loaded with pyfunc.load_model(),
        as soon as the PythonModel is constructed.

        Args:
            context: MLflow context where the model artifact is stored.
        """
        import joblib
        self.model = joblib.load(context.artifacts["model_path"])

    def predict(self, context, model_input):
        """Overrides the abstract predict() so that it returns the model itself.

        Args:
            context: MLflow context where the model artifact is stored.
            model_input: the input data to feed into the model.

        Returns:
            The loaded model artifact.
        """
        return self.model
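Once model.pkl has been dumped (next snippet), my rough local sanity check for the wrapper looks like this (a sketch; I fake the context object, since load_context only reads its artifacts dict):

from types import SimpleNamespace

# hypothetical stand-in for the MLflow context; only .artifacts is read
fake_context = SimpleNamespace(artifacts={"model_path": "model.pkl"})

wrapper = WrapperPythonModel()
wrapper.load_context(fake_context)

# the wrapper's predict() ignores the input and returns the raw CustomForest
raw_model = wrapper.predict(fake_context, None)
pred, value = raw_model.predict(X, second_arg=True)  # X being my feature matrix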
And here is how I save and log it:
import joblib
import mlflow

model = CustomForest()
model.fit(X, y)

model_path = 'model.pkl'
joblib.dump(model, model_path)
artifacts = {"model_path": model_path}

with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        artifact_path=model_path,
        registered_model_name='model',
        python_model=WrapperPythonModel(),
        code_path=["models.py"],
        artifacts=artifacts,
    )
But when I load it and deploy on another machine, I get an error saying that module models.py is not found. How can I fix that? I thought that specifying the code_path parameter took care of local files that are absent on the target machine.
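For reference, loading on the other machine looks roughly like this (a sketch; the run id is a placeholder for the actual run):

import mlflow.pyfunc

# this is the step where the "module models.py not found" error appears
loaded = mlflow.pyfunc.load_model("runs:/<run_id>/model.pkl")

# the wrapper hands back the raw CustomForest, so the custom predict is reachable
raw_model = loaded.predict(X)
pred, value = raw_model.predict(X, second_arg=True)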