I have a question related to gensim. I like to know whether it is recommended or necessary to use pickle while saving or loading a model (or multiple models), as I find scripts on GitHub that do either.
mymodel = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
mymodel.delete_temporary_training_data(keep_doctags_vectors=True, keep_inference=True)
See here
Variant 1:
import pickle
# Save
mymodel.save("mymodel.pkl") # Stores *.pkl file
# Load
mymodel = pickle.load("mymodel.pkl")
Variant 2:
# Save
model.save(mymodel) # Stores *.model file
# Load
model = Doc2Vec.load(mymodel)
In gensim.utils
, it appears to me that there is a pickle function embedded: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/utils.py
def save ... try: _pickle.dump(self, fname_or_handle, protocol=pickle_protocol) ...
Goal of my question: I would be glad to learn 1) whether I need pickle (for better memory management) and 2) in case, why it's better than loading *.model files.
Thank you!