
Task: Document classification with CountVectorizer and TfidfTransformer, using SVC as the classifier.

I've trained the model, tested it, and saved it along with the CountVectorizer and TfidfTransformer using pickle.

Now when I load the model and use it to predict, it raises AttributeError: dense not found.

I checked the shape of the inputs in both the training and testing phases. It's the same, so I don't think shape is the issue.

No version updates were made between training the model and loading it. Everything is on the same version, scikit-learn==0.23.2.

Here is my code:

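import pickle
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.svm import SVC

#Training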
X_train = train["Text"]
Y_train = train["Class"]

count = CountVectorizer(lowercase=False)
X_count_vector = count.fit_transform(X_train)

tfidf = TfidfTransformer(smooth_idf=True, use_idf=True)
X_train = tfidf.fit_transform(X_count_vector)

classifier = SVC()
classifier.fit(X_train.todense(), Y_train) #Training as dense matrix

#Testing
cvec = count.transform(test["Text"])
predable = tfidf.transform(cvec)
pred = classifier.predict(predable.todense()) #This line works, it gives the predicted values as expected.

#Saving the model
pickle.dump(classifier, open("classifier.txt", 'wb'))
pickle.dump(count, open("countVec.txt", 'wb'))
pickle.dump(tfidf, open("tfidf.txt", 'wb'))

#Loading the model
classifiernew = pickle.load(open("classifier.txt", 'rb'))
countVectornew = pickle.load(open("countVec.txt", 'rb'))
tfidfnew = pickle.load(open("tfidf.txt", 'rb'))

#Using the Loaded Model
newInp = test["Text"]
countVec = countVectornew.transform(newInp)
X_test = tfidfnew.transform(countVec)
#Prediction, this is where the error occurs
predd = classifiernew.predict(X_test.todense()) #The error causing line

Here is the full Traceback:

res = clf.predict(testable.dense())
Traceback (most recent call last):

  File "<ipython-input-74-d4714966beb3>", line 1, in <module>
    res = clf.predict(testable.dense())

  File "...\env\lib\site-packages\scipy\sparse\base.py", line 687, in __getattr__
    raise AttributeError(attr + " not found")

AttributeError: dense not found
Venkatesh Dharavath

1 Answer


Now that this question has been reopened, I can post the workaround I found. The traceback shows that the failing call was .dense(), and SciPy sparse matrices have no method by that name, which is why the AttributeError is raised. I used .toarray() instead, which worked perfectly for me.
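As a minimal sketch of the difference, the snippet below builds a small sparse matrix by hand (the values are made up for illustration and stand in for the TF-IDF output) and tries the three spellings. It assumes only NumPy and SciPy are installed. The variable names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for the sparse matrix returned by tfidf.transform(...)
X = csr_matrix(np.array([[0.0, 1.5], [2.0, 0.0]]))

# .dense() is not a sparse-matrix method, so this raises AttributeError
try:
    X.dense()
    dense_exists = True
except AttributeError:
    dense_exists = False

# Both of these exist: .toarray() returns a numpy.ndarray,
# .todense() returns a numpy.matrix for csr_matrix inputs
as_array = X.toarray()
as_matrix = X.todense()

print(dense_exists, type(as_array).__name__, type(as_matrix).__name__)
```

In the prediction step, this corresponds to calling classifiernew.predict(X_test.toarray()). A plain ndarray is generally the safer choice to pass on to an estimator.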

Happy modeling.

Venkatesh Dharavath