
I'm starting out with text mining in R and I've run into a problem. I have a CSV with users' comments about a page. Each row is a different comment, and the file has only one column, the one that holds the comments. I was trying to use Tidy in R, so I imported the file (read.csv) and got a data frame whose column is a factor with n levels. The next step is to tokenize the rows.

The CSV looks like this:

[screenshot of the CSV file]

#load the data
prueba <- read.csv(file="C:/Users/Mr & Mrs Bean/Downloads/Prueba.csv", sep=";")
#trying to tokenize
ty_prueba <- tidy(prueba)
Error in UseMethod("tidy") : 
  no applicable method for 'tidy' applied to an object of class "factor"

As you can see, I get that error. I've also tried converting that column to character, but I get the same error. Every example I find starts from text that is already prepared, so it's hard to see how raw text gets prepared. It's a rookie problem, so any advice will be appreciated.

Pablo

1 Answer


I have found a solution. As someone posted here, I now use read_excel (from the readxl library) instead of read.csv. It works for me. I suppose it's related to how R reads the file.

Pablo
  • `read.csv` reads character data as a factor by default, `read_csv` does not. Disable by using `stringsAsFactors = FALSE` in the `read.csv` call. – phiver Mar 22 '20 at 08:57
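To make the comment above concrete, here is a minimal sketch of reading the comments as character and then tokenizing them. It is only an illustration under some assumptions: the column name `comment` and the two sample rows are made up, the temporary file stands in for Prueba.csv, and the tokenizing step uses `unnest_tokens` from the tidytext package, which may not be what the asker had in mind. Note that `tidy()` is not a tokenizer, which is probably why it fails on the raw data frame.

```r
# Write a tiny semicolon-separated file so the example is self-contained.
# The column name "comment" and the sample rows are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("comment", "Great page, very useful", "The page loads slowly"),
  csv_path
)

# stringsAsFactors = FALSE keeps the text as character instead of factor.
# (From R 4.0.0 onwards this is already the default.)
prueba <- read.csv(file = csv_path, sep = ";", stringsAsFactors = FALSE)
str(prueba)

# Tokenize into one word per row with tidytext, if the package is installed.
if (requireNamespace("tidytext", quietly = TRUE)) {
  ty_prueba <- tidytext::unnest_tokens(prueba, output = word, input = comment)
  print(head(ty_prueba))
}
```

If the column has already been read as a factor, converting it with `prueba$comment <- as.character(prueba$comment)` before tokenizing should have the same effect.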