I am trying to measure the performance of the Liu-Hu lexicon analyzer (demo_liu_hu_lexicon) from NLTK so that I can compare it with other models.
I have the following CSV, which I load into a pandas DataFrame. It looks like this:
   Sentiment                                               Text
0          1  "When will the suffering end?"\r\nSubaru enjoy...
1          1  ~This is a preliminary review. As of writing t...
2          1  As of the time I'm writing this review I've on...
3          0  Fine then. I'll be the first one to say it. Th...
4          1  here are two responses I get, without fail, ev
Pandas Code:
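import pandas as pd

anime = pd.read_csv("Models/Benchmarks/AnimeReviews.csv", encoding="utf-8")
anime.shape                    # (rows, columns)
anime.dtypes                   # Sentiment: int64, Text: object
anime.convert_dtypes().dtypes  # returns a converted copy; anime itself is unchanged
anime.dtypes                   # still int64 / object
x_anime = anime.iloc[:, 1]     # Text column
y_anime = anime.iloc[:, 0]     # Sentiment column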
Sentiment is returned as int64 and the sentences as object.
I want to store all of the analyzer's results in an array and then pass them to sklearn.metrics.accuracy_score, together with the true labels, to get the accuracy.
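As a minimal sketch of that comparison step, assuming the predictions end up as 0/1 labels aligned with the Sentiment column: the true labels below are the five rows shown above, while the predictions are made up purely for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# True labels for the five rows shown above (Sentiment column).
y_true = np.array([1, 1, 1, 0, 1])
# Hypothetical predictions, made up for illustration only.
y_pred = np.array([1, 0, 1, 0, 1])

# accuracy_score returns the fraction of positions where the labels match.
accuracy = accuracy_score(y_true, y_pred)
print(accuracy)
```

Both arrays need the same length and the same label encoding, so whatever the analyzer produces has to be mapped to 0/1 first.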
I did the following:
import numpy as np
from nltk.sentiment.util import demo_liu_hu_lexicon

results = np.array(object)
for sentence in x_anime:
    result = demo_liu_hu_lexicon(sentence)
    np.append(results, result)
After the loop, the array is still empty. When I tried it with a single sentence (no for loop), the result was saved in the array, so I suspect the sentences are not being passed to the analyzer inside the loop.
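One thing worth checking here, as a small sketch that does not involve NLTK at all: `np.array(object)` creates a zero-dimensional array holding the `object` type itself, and `np.append` returns a new array instead of modifying its argument. The label strings below are made up for illustration.

```python
import numpy as np

start = np.array(object)   # 0-d array wrapping the `object` type, not an empty array
print(start.shape)         # zero-dimensional shape

results = np.array([], dtype=object)   # a genuinely empty 1-d array
np.append(results, "Positive")         # return value discarded, results is unchanged
size_without_assign = results.size

results = np.append(results, "Positive")   # reassign to keep the appended copy
results = np.append(results, "Negative")
size_with_assign = results.size
print(size_without_assign, size_with_assign)
```

If this is what happens in the loop, the array would stay as it was no matter what the analyzer returns.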
How can I solve this?
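A minimal sketch of how the loop could collect labels, under one assumption that should be verified against the NLTK source: `demo_liu_hu_lexicon` appears to print its label ("Positive", "Negative" or "Neutral") and return `None`, in which case there is nothing to store. The `classify` function, its tiny word sets, and the 1/0/-1 encoding below are hypothetical stand-ins for a lexicon lookup that returns a label; they are not part of NLTK.

```python
import numpy as np

# Hypothetical mini-lexicon, for illustration only (a real opinion lexicon is far larger).
POSITIVE = {"good", "great", "enjoy"}
NEGATIVE = {"bad", "boring", "suffering"}

def classify(sentence):
    """Return 1 for positive, 0 for negative, -1 for neutral (assumed encoding)."""
    words = sentence.lower().split()
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    if pos > neg:
        return 1
    if pos < neg:
        return 0
    return -1

sentences = ["a great show", "bad and boring", "nothing special"]  # made-up inputs

# Collect in a plain list and convert once, which avoids np.append entirely.
results = np.array([classify(s) for s in sentences])
print(results)
```

With the real data, `sentences` would be `x_anime`, and neutral predictions would need a decision on how to count them before comparing against the 0/1 Sentiment column.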