
I'm doing text mining on some Freud books from Project Gutenberg. When I try to run a sentiment analysis using the following code:

library(dplyr)
library(tidytext)
library(gutenbergr)

freud_books <- gutenberg_download(c(14969, 15489, 34300, 35875, 35877, 38219, 41214), meta_fields = "title")

tidy_books <- freud_books %>%
  unnest_tokens(word, text)

f_sentiment <- tidy_books %>%
  inner_join(get_sentiments("bing"), by = "word") %>% 
  count(title, index = line %/% 80, sentiment) %>% 
  spread(sentiment, n, fill = 0) %>% 
  mutate(sentiment = positive - negative)

I get the error:

Error in mutate_impl(.data, dots) : Evaluation error: non-numeric argument to binary operator.

I can see that the problem is in the last block, in the count function. Any help with this?

phiver
  • Quick question, what is `index = line %/% 80` trying to achieve? when `inner_join`ed, there is no column called `line`. Why is it not `count(title, sentiment)`? – Kim May 03 '18 at 20:58

1 Answer


The `line` column does not exist in your data, so if you need it you have to create it yourself with `mutate()` after the `inner_join()`.
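To see why a missing `line` column leads to this particular error, here is a small base R check. It assumes a default session where the stats package is attached, so the bare name `line` most likely resolves to the function `stats::line` when no column of that name is found.

```r
# With no `line` column available, the name falls back to a function
# (assumption: default R session with the stats package attached).
is.function(line)

# Integer-dividing a function by a number fails; capture the message
# instead of stopping the script.
msg <- tryCatch(
  line %/% 80,
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should match the one reported in the question, which suggests the failure comes from `line %/% 80` rather than from `spread()` or the final `mutate()`.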

Pay attention to the `mutate(line = row_number())` step: it numbers the rows that remain after the join. You can change it if you need another way of assigning line numbers. Once `line` exists, you can use `index = line %/% 80` in `count()`.

Try this:

library(dplyr)
library(tidyr)      # provides spread()
library(tidytext)
library(gutenbergr)

freud_books <- gutenberg_download(c(14969, 15489, 34300, 35875, 35877, 38219, 41214),
                                  meta_fields = "title")

# One row per word
tidy_books <- freud_books %>%
  unnest_tokens(word, text)

f_sentiment <- tidy_books %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  # Create the missing line column from the row position
  mutate(line = row_number()) %>%
  # Count sentiment words per title and 80-row chunk
  count(title, index = line %/% 80, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%
  mutate(sentiment = positive - negative)
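
One possible variation, if the line numbers should restart for each book and refer to lines of text rather than to matched words, is to number the lines per title before tokenizing. The sketch below is only an illustration under stated assumptions: `toy_books` and `toy_lexicon` are hypothetical stand-ins for the Gutenberg download and `get_sentiments("bing")`, words are split with `strsplit()` instead of `unnest_tokens()`, and the chunk size is 2 instead of 80 so it runs without any downloads.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per line of text, like gutenberg_download() output
toy_books <- tibble(
  title = c("Book A", "Book A", "Book A", "Book B", "Book B"),
  text  = c("good day", "bad night", "happy end", "sad start", "great finish")
)

# Hypothetical toy lexicon standing in for get_sentiments("bing")
toy_lexicon <- tibble(
  word      = c("good", "bad", "happy", "sad", "great"),
  sentiment = c("positive", "negative", "positive", "negative", "positive")
)

toy_sentiment <- toy_books %>%
  group_by(title) %>%
  mutate(line = row_number()) %>%        # line numbers restart for each title
  ungroup() %>%
  mutate(word = strsplit(text, " ")) %>% # list column of words per line
  unnest(cols = word) %>%                # one row per word, line number kept
  inner_join(toy_lexicon, by = "word") %>%
  count(title, index = line %/% 2, sentiment) %>%
  spread(sentiment, n, fill = 0) %>%
  mutate(sentiment = positive - negative)

print(toy_sentiment)
```

Numbering before tokenizing keeps `index` tied to positions in the original text, so each chunk covers the same number of source lines regardless of how many words match the lexicon.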
Alaleh
  • Thanks @Alaleh, works perfectly. I just needed to add the tidyr package. – Ricardo Silva May 05 '18 at 15:31
  • what does the "title" in count represent? Titles? I saw another example with "book" when things were grouped by book. What if we want to check all the text? Links appreciated. – wayneeusa Mar 21 '19 at 10:58