I'm analysing a column of words in my most_used_words data frame, which contains 2,180 words.
most_used_words
   word      times_used
   <chr>          <int>
 1 people            70
 2 news              69
 3 fake              68
 4 country           54
 5 media             44
 6 u.s               42
 7 election          40
 8 jobs              37
 9 bad               36
10 democrats         35
# ... with 2,170 more rows
When I inner_join with the AFINN lexicon, only 364 of the 2,180 words are scored. Is this because the other words in my data frame don't appear in the AFINN lexicon? If that's the case, I'm afraid it may introduce bias into my analysis. Should I use a different lexicon? Or is something else happening?
library(tidytext)
library(tidyverse)
afinn <- get_sentiments("afinn")
# keep only the words that have an AFINN score
most_used_words %>%
  inner_join(afinn, by = "word")
   word    times_used score
   <chr>        <int> <int>
 1 fake            68    -3
 2 bad             36    -3
 3 win             24     4
 4 failing         21    -2
 5 hard            20    -1
 6 united          19     1
 7 illegal         17    -3
 8 cuts            15    -1
 9 badly           13    -3
10 strange         13    -1
# ... with 354 more rows
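One way to check what is being dropped might be to list the unmatched words with anti_join and to compare how many words are scored against how much of the total usage they account for. The following is only a minimal sketch on toy data: the small most_used_words tibble and afinn_toy are made-up stand-ins (assumptions for illustration), not my real data or the real AFINN lexicon, and unscored and coverage are names chosen here.

```r
library(dplyr)

# Toy stand-ins (assumptions): a few rows shaped like most_used_words,
# and a tiny made-up lexicon in place of get_sentiments("afinn")
most_used_words <- tibble(
  word       = c("people", "news", "fake", "country", "bad"),
  times_used = c(70L, 69L, 68L, 54L, 36L)
)
afinn_toy <- tibble(
  word  = c("fake", "bad", "win"),
  score = c(-3L, -3L, 4L)
)

# Words in the data that the lexicon does not score
unscored <- most_used_words %>%
  anti_join(afinn_toy, by = "word")

# Share of distinct words, and share of total usage, that receives a score
coverage <- most_used_words %>%
  mutate(scored = word %in% afinn_toy$word) %>%
  summarise(
    share_of_words = mean(scored),
    share_of_usage = sum(times_used[scored]) / sum(times_used)
  )

unscored
coverage
```

If the unscored words turn out to be mostly neutral terms (like "people" or "country" in the toy data), a low match rate would be expected rather than a sign that the join went wrong.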