I am quite new to MLflow. I am using a hashing TF-IDF vectorizer and a logistic regression model (from PySpark ML) to work on a basic NLP problem, and I am using MLflow to track the model training and to log the model. I need to use this model elsewhere for batch prediction, and I have logged only the final model. My question is: do I also need to log the hashing TF-IDF vectorization model to MLflow during training, as a separate artifact in addition to the ML model? Currently, I put the hashing TF-IDF stages in a Spark Pipeline along with the model and log that pipeline. I am not sure whether this is an efficient way to do it.