I am trying out different classifiers for the example on the scikit-learn website: http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html. However, the code below raises an error: ValueError: setting an array element with a sequence.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
import tensorflow.contrib.learn as skflow

data = ["I so handsome. I just broke the mirror!","I am a normal guy."]
label = np.array([0,1])

# CountVectorizer
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(data)

#TfidfTransformer
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)

#Classifier
clf = skflow.TensorFlowLinearClassifier(n_classes=2)
clf.fit(X_train_tfidf, label)
edwin

1 Answer

TensorFlowLinearClassifier does not accept a CSR (sparse) matrix as input, which is what TfidfTransformer.fit_transform() returns; you can follow the progress in that issue.


What you can do for now is convert X_train_tfidf to a dense NumPy array before passing it to clf.fit():

clf.fit(X_train_tfidf.toarray(), label)
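
To see what the conversion does, here is a minimal sketch that uses only scikit-learn, SciPy and NumPy (it deliberately leaves out the TensorFlow classifier). The names X_counts, X_dense and is_sparse are illustrative and not from the question.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

data = ["I so handsome. I just broke the mirror!", "I am a normal guy."]

# Same two preprocessing steps as in the question
X_counts = CountVectorizer().fit_transform(data)
X_tfidf = TfidfTransformer().fit_transform(X_counts)

# The transformer output is a SciPy sparse matrix, not a NumPy array
is_sparse = sparse.issparse(X_tfidf)

# toarray() stores every zero explicitly and returns a regular 2-D ndarray
X_dense = X_tfidf.toarray()

print(is_sparse, type(X_dense).__name__, X_dense.shape)
```

Keep in mind that the dense array needs memory for n_samples * n_features values, so this workaround is fine for small examples like this one but may not scale to a large vocabulary.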
Olivier Moindrot