
I have a CSV file whose cells contain phrases like:

dd <- c("hello how are you?; I am fine; hello how are you?; not too bad")

I want to get the frequency of each block of sentences (divided by ;) using wordcloud. However, what I get is the frequency per word.

Is there a way to get the frequency per block of content in each cell?

In this toy example I would get:

Text                   Freq
----------------------------
hello how are you?     2
I am fine              1
not too bad            1

Thank you very much in advance

Jaap
user11444

1 Answer

FWIW, try this

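library(wordcloud)
library(tm)
# Each element is one cell; phrases within a cell are separated by ";"
txt <- c("hello how are you?; I am fine", "hello how are you?; not too bad")
# Split on ";" and trim surrounding whitespace so identical phrases match
semicolonTokenizer <- function(x) trimws(unlist(strsplit(as.character(x), ";", fixed = TRUE)))
# VCorpus is used here because recent tm versions may ignore a custom tokenizer on the corpus that Corpus() builds
tdm <- TermDocumentMatrix(VCorpus(VectorSource(txt)), control = list(tokenize = semicolonTokenizer))
# Sum each phrase's count across all documents
tab <- rowSums(as.matrix(tdm))
# min.freq = 1 because wordcloud() drops terms with frequency below 3 by default
wordcloud(names(tab), tab, min.freq = 1)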
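
If all you need is the frequency table from the question, a base R sketch along these lines may be enough. It assumes each cell is a single string with phrases separated by ";"; the names dd, phrases, freq and result are only illustrative.

```r
# Toy data: one string per cell, phrases separated by ";" (assumed layout)
dd <- c("hello how are you?; I am fine", "hello how are you?; not too bad")

# Split every cell on ";" and trim whitespace so identical phrases match
phrases <- trimws(unlist(strsplit(dd, ";", fixed = TRUE)))

# Count each distinct phrase, most frequent first
freq <- sort(table(phrases), decreasing = TRUE)

# Collect the counts in a two-column data frame like the one in the question
result <- data.frame(Text = names(freq), Freq = as.vector(freq),
                     stringsAsFactors = FALSE)
print(result)
```

The Text and Freq columns can then be passed to wordcloud() as the words and their frequencies; with counts this small you would probably want min.freq = 1 there as well.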
lukeA