Saving NLP vectorization models in MLflow on Databricks

Asked Oct 07 '22 at 09:25

I am quite new to MLflow. I am using a hashing TF-IDF vectorizer and a logistic regression model (both from PySpark ML) to work on a basic NLP problem, and I am using MLflow to track the model training and to log the model. I need to use this model elsewhere for batch prediction, and I have logged only the final model. My question is: do we also need to log the hashing TF-IDF vectorization model to MLflow (as a separate artifact) during training, in addition to the ML model? Currently, I put the hashing TF-IDF stages in a Spark Pipeline together with the model and log that pipeline. I am not sure whether this is an efficient way to do it.

asked Oct 07 '22 at 09:25

john_ds_dev
