Count number of times a word appears (dplyr)

Question

Simple question here, perhaps a duplicate of this?

I'm trying to figure out how to count the number of times a word appears in a vector. I know I can count the number of rows a word appears in, as shown here:

library(tidyverse)

temp <- tibble(idvar = 1:3,
               response = c("This sounds great",
                            "This is a great idea that sounds great",
                            "What a great idea"))
temp %>% count(grepl("great", response)) # lots of ways to do this line
# answer = 3

The answer in the code above is 3 since "great" appears in three rows. However, the word "great" appears 4 times in total in the vector `response`, because the second row contains it twice. How do I find that instead?

Are you planning to provide a specific word and get the number you want? Or you want to get that number for every word that appears in all sentences? — AntoniosK, Aug 29 '18 at 15:35
Just planning to provide a specific word and get the number. I can use `tidytext` unnest to split sentences into tokens and then count the words. (But if you have recommendations for a different way to do it, I'm all ears!) — Daniel, Aug 29 '18 at 15:40
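For reference, the tokenizing approach mentioned in the comment above might look roughly like the sketch below. It assumes the `dplyr` and `tidytext` packages are installed; the names `tokens` and `n_great` are only illustrative and do not come from the original post.

```r
library(dplyr)
library(tidytext)

# Example data from the question
temp <- tibble(idvar = 1:3,
               response = c("This sounds great",
                            "This is a great idea that sounds great",
                            "What a great idea"))

# Split each response into one row per word.
# unnest_tokens() lowercases the tokens by default.
tokens <- temp %>%
  unnest_tokens(word, response)

# Count the tokens that are exactly the target word
n_great <- tokens %>%
  filter(word == "great") %>%
  nrow()

n_great
```

Because this compares whole tokens, it counts words rather than substrings. Replacing the `filter()` step with `count(word)` would give the frequency of every word instead of just one.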

score 3 · Answer 1 · answered Aug 29 '18 at 15:36

We can use `str_count` from `stringr` to count the occurrences of 'great' in each row and then sum those counts:

library(tidyverse)
temp %>% 
   mutate(n = str_count(response, 'great')) %>%
   summarise(n = sum(n))
# A tibble: 1 x 1
#      n
#   <int>
#1     4

Or, using `regmatches`/`gregexpr` from base R:

sum(lengths(regmatches(temp$response, gregexpr('great', temp$response))))
#[1] 4
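One caveat worth sketching: the pattern 'great' is matched as a substring, so it would also count the "great" inside a word such as "greatest". If only the whole word should count, a word-boundary pattern is one option. The example below is a minimal sketch, not part of the original answer; the fourth string is a hypothetical extra row added purely to show the difference.

```r
library(stringr)

# The three responses from the question, plus one hypothetical extra
# row ("greatest") to show how the two patterns differ.
response <- c("This sounds great",
              "This is a great idea that sounds great",
              "What a great idea",
              "The greatest idea")

# Plain substring match: also counts the "great" inside "greatest"
substring_total <- sum(str_count(response, "great"))

# Whole-word match: \\b anchors the pattern at word boundaries
word_total <- sum(str_count(response, "\\bgreat\\b"))

# Same whole-word match, ignoring case (would also catch "Great")
word_total_ci <- sum(str_count(response,
                               regex("\\bgreat\\b", ignore_case = TRUE)))

substring_total  # 5
word_total       # 4
```

The same `"\\bgreat\\b"` pattern can be passed to `gregexpr` in the base R version.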

answered Aug 29 '18 at 15:36

akrun

Thanks for the addition of `base R` - that may actually be simpler in some of my use cases. – Daniel Aug 29 '18 at 15:46

score 2 · Answer 2 · answered Aug 29 '18 at 15:37

Off the top of my head, this should solve your problem:

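library(tidyverse)
temp$response %>% 
  str_extract_all('great') %>%  # list with one character vector of matches per row
  unlist() %>%                  # flatten into a single vector of matches
  length()                      # total number of matches
# [1] 4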

answered Aug 29 '18 at 15:37

Vlad C.
