In a previous post I asked about saving and loading models with custom mlflow.pyfunc objects and received an excellent answer from Daniel Schneider explaining the difference between mlflow.pyfunc.PythonModel and mlflow.pyfunc.PyFuncModel.
Here, I extend the question, as the proposed solution doesn't work for me when also trying to save and retrieve model artifacts.
I have a class with a 'fit' function that calculates some values and saves them to a dict, and a 'predict' function that uses those values. The predict function works before the model is saved to MLflow, but not after it is reloaded.
Creating the class and running it outside MLflow (using the solution proposed by Daniel Schneider of passing None into the predict function) works fine:
import mlflow.pyfunc
import pandas as pd

# dummy data
data = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=data)

# create class
class PredictSpeciality(mlflow.pyfunc.PythonModel):
    def fit(self):
        print('fit')
        d = {'mult': 2}
        return d

    def predict(self, context, X, d, y=None):
        print('predict')
        X['pred'] = X['col1'] * d['mult']
        return X

# create instance of model, return weights dict and pass weights into predict function
m = PredictSpeciality()
d = m.fit()
m.predict(None, df, d)
However, saving the model with MLflow and reloading it:
mlflow.pyfunc.save_model(path="temp_model", python_model=m)
m2 = mlflow.pyfunc.load_model("temp_model")
m2.predict(None, df, d)
This returns the following error:
predict() takes 2 positional arguments but 4 were given
I'm assuming this is again due to the differences outlined before between mlflow.pyfunc.PythonModel and mlflow.pyfunc.PyFuncModel, but I'm not sure how to handle it.
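For what it's worth, here is a plain-Python sketch of what I suspect is happening. It does not use MLflow at all: LoadedWrapper is a hypothetical stand-in for the object returned by load_model, and it assumes that object's predict accepts only the input data and forwards it to my class with a context it supplies itself. Under that assumption, keeping the fitted dict on the instance (instead of passing it into predict) would avoid the extra arguments, although I have not confirmed this is how the real wrapper behaves.

```python
# Hypothetical stand-in for the wrapper returned by load_model.
# This is NOT MLflow's real class, only an assumption about its shape.
class LoadedWrapper:
    def __init__(self, python_model):
        self._python_model = python_model

    def predict(self, data):
        # Assumed behaviour: the wrapper supplies the context itself
        # and forwards only the input data to the wrapped model.
        return self._python_model.predict(None, data)


class PredictSpeciality:
    def fit(self):
        # Keep the fitted values on the instance instead of returning them.
        self.d = {'mult': 2}
        return self

    def predict(self, context, X):
        # Only (context, X) are needed; the weights come from self.
        return [x * self.d['mult'] for x in X]


m = PredictSpeciality().fit()
m2 = LoadedWrapper(m)

# Calling the wrapper with the data only works.
result = m2.predict([1, 2])

# Calling it the way I did above reproduces the same kind of error.
try:
    m2.predict(None, [1, 2], {'mult': 2})
    error = None
except TypeError as exc:
    error = str(exc)

print(result)
print(error)
```

If this reading is right, the reloaded model would be called as m2.predict(df), with no context and no dict.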