I am new to Spark and am trying to run a count vectorizer on a Koalas DataFrame, but the code below fails. Since Koalas follows the pandas API, I expected this CountVectorizer code to work, but it raises the error: 'Series' object has no attribute '_jdf'.

vectorizer = CountVectorizer()
x=vectorizer.fit(mcp_demo_inner["final_text"])
X = x.transform(mcp_demo_inner["final_text"])
# print(vectorizer.get_feature_names())
counts = ks.DataFrame(X.toarray(),
                      index=mcp_demo_inner.doc_num,
                      columns=vectorizer.get_feature_names())