I have a .csv file that contains a situation like this (additional spaces added for readability):

1, 3 , "string" ,  "string4"     , NA
2, 5 , "string" , "s\"tring\"4"  , 3
1, 3 , "string" , "stri,ng4"     , 5
8, 7 , "string" , "st\"ri,n\"g4" , 5

I am reading this into RStudio on a Windows 10 machine, using the following statement:

read.table("file_name.csv", fill=TRUE, header=FALSE, quote="\"", sep=",", encoding="UTF-8")

With the following response:

   V1 V2     V3           V4     V5 V6
 1  1  3 string      string4   <NA> NA
 2  2  5 string  s\\tring\\4      3 NA
 3  1  3 string     stri,ng4      5 NA
 4  8  7 string       st\\ri  n\\g4  5

The problem seems to be that the comma between the escaped quotes in row 4 is being interpreted as a separator. The backslash-escaped quotes themselves are also dropped, in rows 2 and 4.

I am expecting/looking for something like following, but I'm not sure how to get it.

   V1 V2     V3            V4    V5 
 1  1  3 string       string4  <NA>
 2  2  5 string   s\"tring\"4     3
 3  1  3 string      stri,ng4     5
 4  8  7 string  st\"ri,n\"g4     5

I am considering preprocessing the file with a grep-style substitution to change \" to ', but I'm curious whether there is a more direct method. It seems like a potentially common issue, but I can't find a good example of a solution.
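
For reference, here is a minimal base R sketch of that preprocessing idea. It is not a definitive fix. Instead of swapping \" for ', it rewrites each \" as a doubled quote "", which is the escape that scan/read.table recognise when sep is not whitespace. The sample lines and the names raw_lines, fixed_lines and df are illustrative; on the real file the lines would come from readLines("file_name.csv"). It assumes no field ends in a literal backslash right before its closing quote.

```r
# Sample data mirroring the rows above (the real file has no padding spaces).
raw_lines <- c(
  '1,3,"string","string4",NA',
  '2,5,"string","s\\"tring\\"4",3',
  '1,3,"string","stri,ng4",5',
  '8,7,"string","st\\"ri,n\\"g4",5'
)
# For the actual file (assumption): raw_lines <- readLines("file_name.csv", encoding = "UTF-8")

# Replace the two-character sequence \" with "" (CSV-style doubled quote).
# fixed = TRUE makes gsub treat the pattern literally, not as a regex.
fixed_lines <- gsub('\\"', '""', raw_lines, fixed = TRUE)

# Parse the cleaned lines; each element of `text` is read as one line.
df <- read.csv(text = fixed_lines, header = FALSE, quote = "\"",
               stringsAsFactors = FALSE)

print(df)
```

With this substitution the fourth column should keep both the embedded quotes and the embedded comma, and the data frame should have five columns rather than six.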

Thoughts, anyone?

1 Answer

Try using read.table("file_name.csv"). It gave me the desired output.

Fabio Marroni