I am trying to import a txt/csv file into R. Here is a simplified example:

txt <- "
id,value
001,'Mary'
002,'Mary's, husband'"

In the second data row, I would like the value of the variable value to be Mary's, husband. The field is delimited with ', but there is also a ' after Mary.

We could ignore the quotes on import and clean them up afterwards. But since , is also the separator, the import goes wrong when done as follows:

df1 <- read.csv(textConnection(txt))
df1$value <- gsub("^'|'$", "", df1$value)
df1

My question is how to import the data correctly in this case, so that Mary's, husband is kept as a single value.

[Update]: as one comment points out, this file is not correctly built. But we can assume that there are always two columns, and that the first , in each row is the separator.
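One possible approach under the assumption stated in the update (exactly two columns, with the first , on each row acting as the separator) is to skip the CSV parser and split each line manually. The following is only a sketch in base R, not a general CSV reader; the names lines, header, body and df are chosen here for illustration, and it assumes every row contains at least one comma.

```r
txt <- "
id,value
001,'Mary'
002,'Mary's, husband'"

# Read the raw lines; for a real file, readLines("path/to/file") would be used instead.
con <- textConnection(txt)
lines <- readLines(con)
close(con)

# Drop empty lines (the example string starts with a newline).
lines <- lines[nzchar(lines)]

# The first line holds the column names, the rest is data.
header <- strsplit(lines[1], ",", fixed = TRUE)[[1]]
body <- lines[-1]

# Split each row at the FIRST comma only:
# everything before it is column 1, everything after it is column 2.
df <- data.frame(
  sub(",.*$", "", body),
  sub("^[^,]*,", "", body),
  stringsAsFactors = FALSE
)
names(df) <- header

# Strip only the outer quotes, keeping the apostrophe inside Mary's.
df$value <- gsub("^'|'$", "", df$value)
df
```

Because the first column is kept as character, the leading zeros in id are preserved; as.integer(df$id) can be applied afterwards if numeric ids are preferred.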

Mary Smith
  • df <- read.csv(text = txt, quote = "'") – danh Mar 01 '22 at 17:18
  • Thank you danh, I have updated my question. I forgot that the situation is a little more complicated. – Mary Smith Mar 01 '22 at 17:23
  • Well, the problem is that's an invalid CSV file. If you have quotes in quotes those should be escaped. Where did this CSV file come from? It would be better to fix the mistake at the time the file is made rather than clean up the mess later. – MrFlick Mar 01 '22 at 17:26
  • Using `read_csv` from the tidyverse will handle this automatically, but it will remove the "'" inside of Mary's, if that is acceptable: `df <- read_csv(txt, quote = "'")` – danh Mar 01 '22 at 17:32
