
I am trying to replicate an analysis using tidytext in R, but with a loop. The example comes from Julia Silge and David Robinson's Text Mining with R: A Tidy Approach. The context for it can be found here: https://www.tidytextmining.com/sentiment.html#sentiment-analysis-with-inner-join.

In the text, they give an example of how to do sentiment analysis using the NRC lexicon, which has eight different sentiments, including joy, anger, and anticipation. I'm not doing an analysis for a specific book like the example, so I commented out that line, and it still works:

nrc_list <- get_sentiments("nrc") %>% 
  filter(sentiment == "joy")

wordcount_joy <- wordcount %>%
# filter(book == "Emma") %>%
  inner_join(nrc_list) %>%
  count(word, sort = TRUE)

As I said, this works. I now want to modify it to loop over all eight emotions and save each result in a data frame labeled with the emotion. Here is how I tried to modify it:

emotion <- c('anger', 'disgust', 'joy', 'surprise', 'anticip', 'fear', 'sadness', 'trust')

for (i in emotion) {

nrc_list <- get_sentiments("nrc") %>% 
  filter(sentiment == "i")

wcount[[i]] <- wordcount  %>%
  inner_join(nrc_list) %>%
  count(word, sort = TRUE)

}

I get an "Error: object 'wcount' not found" message when I do this. I have googled this, and the usual answer seems to be to use wcount[[i]], but clearly something is off in how I adapted it. Do you have any suggestions?

  • sentiment == "i" is looking for a sentiment that has the string value 'i'. Can you try your code with sentiment == i? Also, is wcount already initialized as a list? It doesn't look like it is. You can initialize it with wcount = list() before your for-loop. – Brigitte Nov 04 '19 at 18:36
  • Thanks for the response. Removing the quotes doesn't change anything, but inserting wcount=list() like you said leads to 8 tibbles, one for each element i. Unfortunately, they're all empty, with no rows. In theory if I can populate those tibbles it will be fine, but it is a little different than what happened with the creation of the wordcount_joy dataframe in the Silge/Robinson book (and that I use here) that I can call up later. – Jonathan D. Nov 04 '19 at 18:48
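The two points raised in the comments can be reproduced without downloading the NRC lexicon. Below is a minimal sketch in base R, assuming a made-up three-word lexicon: toy_lexicon and toy_words are hypothetical stand-ins for get_sentiments("nrc") and wordcount, not real data.

```r
# Hypothetical stand-ins for get_sentiments("nrc") and wordcount
toy_lexicon <- data.frame(
  word      = c("happy", "angry", "glad"),
  sentiment = c("joy", "anger", "joy"),
  stringsAsFactors = FALSE
)
toy_words <- data.frame(
  word = c("happy", "angry", "happy", "glad", "table"),
  stringsAsFactors = FALSE
)

emotion <- c("joy", "anger")

# The list must exist before wcount[[i]] <- ... can assign into it
wcount <- list()

for (i in emotion) {
  # Compare against the loop variable i, not the literal string "i"
  lex_i   <- toy_lexicon[toy_lexicon$sentiment == i, ]
  matched <- merge(toy_words, lex_i, by = "word")

  # Count occurrences of each matched word, most frequent first
  counts <- as.data.frame(table(word = matched$word), stringsAsFactors = FALSE)
  names(counts) <- c("word", "n")
  wcount[[i]] <- counts[order(-counts$n), ]
}

# With the quoted "i", no lexicon row matches, so every result would be empty
n_quoted <- nrow(toy_lexicon[toy_lexicon$sentiment == "i", ])
```

The same reasoning applies to any label that does not occur in the lexicon's sentiment column: the filter returns zero rows, and the join that follows is empty.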

1 Answer


The code below will do the trick. Note that you refer to wordcount in your loop, whereas the example in the book uses tidy_books. The code follows the steps in the tidytextmining chapter you link to.

library(janeaustenr)
library(dplyr)
library(stringr)
library(tidytext)

tidy_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]", 
                                                 ignore_case = TRUE)))) %>%
  ungroup() %>%
  unnest_tokens(word, text)

# the NRC lexicon labels this emotion "anticipation", not "anticip"
emotion <- c('anger', 'disgust', 'joy', 'surprise', 'anticipation', 'fear', 'sadness', 'trust')
# initialize list with the length of the emotion vector
wcount <- vector("list", length(emotion))
# name the list entries
names(wcount) <- emotion

# run loop
for (i in emotion) {
  nrc_list <- get_sentiments("nrc") %>% 
    filter(sentiment == i)

  wcount[[i]] <- tidy_books  %>%
    inner_join(nrc_list, by = "word") %>%
    count(word, sort = TRUE)
}
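Once the loop has run, each emotion's counts can be pulled out of the list by name, or the whole list can be stacked into one data frame. This is a small sketch of both, assuming a hand-made wcount list with invented words and counts in place of the real results:

```r
library(dplyr)

# Hypothetical stand-in for the wcount list built by the loop
wcount <- list(
  joy   = tibble(word = c("good", "friend"), n = c(3L, 2L)),
  anger = tibble(word = "bad", n = 1L)
)

# A single emotion's counts; wcount[["joy"]] is equivalent
joy_counts <- wcount$joy

# All emotions in one tibble, with the list names stored in a "sentiment" column
all_counts <- bind_rows(wcount, .id = "sentiment")
```

Keeping the results in a named list rather than eight separate objects makes it easy to loop over them again later.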
phiver