I am using pickle to save a Naive Bayes (Bayes' theorem) classifier model. After training on 5,600 records, the saved file is 2.1 GB. Loading that file takes nearly 2 minutes, and classifying a single piece of text then takes 5.5 minutes. I am using the following code to load it and classify:

import pickle
with open("classifier.pickle", "rb") as f:  # close the file handle once loaded
    classifierPickle = pickle.load(f)
print(classifierPickle.classify("want to go some beatifull work place"))

The load call reads the pickled object, and the classify call returns the topic (category) of the given text. I am using the following code to save the model:

import pickle
with open('C:/burberry_model/classifier.pickle', 'wb') as f:
    pickle.dump(classifier, f, -1)  # classifier = the trained model; -1 = highest pickle protocol

Everything I am using comes from textblob. The environment is Windows with 28 GB of RAM and a four-core CPU. It would be very helpful if anyone could help resolve this issue.

Balaji

1 Answer

Since textblob is built on top of NLTK, its classifier is a pure Python implementation, which makes it much slower than compiled code. Secondly, since your pickle file is 2.1 GB, unpickling has to rebuild the whole object in RAM, where it likely takes up even more space than it does on disk, and that increases the time further.

Also, since you're using Windows, Python may be somewhat slower. If speed is your main concern, it would be useful to keep the feature selection and vector construction from textblob/NLTK but switch to the scikit-learn Naive Bayes classifier, which runs on compiled C code, so I'm guessing it would be significantly faster.
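
As a rough illustration of the scikit-learn route, here is a minimal sketch. It swaps in scikit-learn's own `CountVectorizer` for feature extraction instead of the textblob/NLTK extractor, and the training texts and labels are made-up placeholders rather than your data, so treat it as a starting point and not a drop-in replacement.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; replace with the real (text, category) records.
train_texts = [
    "want to go to some beautiful place for work",
    "looking for a quiet beach holiday",
    "the new phone has a great camera",
    "this laptop battery lasts all day",
]
train_labels = ["travel", "travel", "tech", "tech"]

# Bag-of-words counts feed a multinomial Naive Bayes model.
# The vectorizer stores features as a sparse matrix, which keeps the
# fitted model (and therefore its pickle) compact.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Round-trip through pickle, as in the question, using the highest protocol.
blob = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

# predict() takes a list of documents and returns one label per document.
prediction = restored.predict(["want to go some beautiful work place"])[0]
print(prediction)
```

Pickling the whole pipeline keeps the vocabulary and the classifier together, so the loaded object can classify raw text directly. How much faster this is on 5,600 records is something you would need to measure yourself.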

Ankit Vadehra