I have files that look like this:

|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|
|2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|

Basically, the columns are separated by commas and wrapped by pipes.

How do I read something like this using R?

I have tried

read.table("file.txt", sep="|")

but this doesn't work well, since every other column just contains a comma. I have tried using "|,|" as the separator, but apparently this is not allowed. Using "," doesn't work at all since the names then get split up.

Any easy way to do this?

Atom Vayalinkal

2 Answers

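read.table("./temp.csv", sep = ",", quote = "|") will do the trick: with the pipe set as the quote character, commas inside a pipe-wrapped field are no longer treated as separators.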
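As a quick check of the idea, here is a minimal sketch that feeds the two sample rows from the question through `text =` instead of a file. The names `sample_lines` and `df` are only illustrative, and `stringsAsFactors = FALSE` is an added assumption so the name column stays character on older R versions.

```r
# Sample rows copied from the question; `sample_lines` is an illustrative name.
sample_lines <- c(
  "|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|",
  "|2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|"
)

# Treat "|" as the quote character so commas inside a wrapped field are kept.
df <- read.table(text = sample_lines, sep = ",", quote = "|",
                 stringsAsFactors = FALSE)

str(df)  # inspect the resulting columns and their types
```

Because the default quote characters are replaced rather than extended, any `"` or `'` in the data is read literally, which is usually what you want for names.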

Wimpel

You can also replace the delimiter with a different separator before reading:

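plouf <- readChar("file.txt", file.info("file.txt")$size)  # read the whole file as one string
plouf <- gsub("\\|,\\|", ";", plouf)  # replace the "|,|" separator with ";"
plouf <- gsub("\\|", "", plouf)       # remove the remaining leading and trailing pipes
read.table(text = plouf, sep = ";")   # parse the string (not a file name) with ";" as sep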

A test:

plouf <- "|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|
          |2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|"

plouf <- gsub("\\|,\\|",";",plouf)
plouf <- gsub("\\|","",plouf)
read.table(text = plouf,sep=";")

    V1       V2             V3                  V4
1 2000 23456745 23567897tyhgy6 SHARP, RODNEY H III
2 2000 12345678 34567tgh788877 WOOLARD, EDGAR S JR
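The same replacement can be wrapped in a small helper that works line by line. This is only a sketch: `read_piped` is a hypothetical name, and it assumes no field contains a ";" or a literal "|". It also passes `quote = ""`, an added choice so that apostrophes in names are not treated as quotes.

```r
# Hypothetical helper; assumes no field contains ";" or a literal "|".
read_piped <- function(path) {
  lines <- readLines(path)
  lines <- gsub("|,|", ";", lines, fixed = TRUE)  # turn field separators into ";"
  lines <- gsub("|", "", lines, fixed = TRUE)     # drop the remaining outer pipes
  read.table(text = lines, sep = ";", quote = "", stringsAsFactors = FALSE)
}

# Write the sample rows from the question to a temporary file and read them back.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|",
  "|2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|"
), tmp)
result <- read_piped(tmp)
unlink(tmp)  # clean up the temporary file
```

Using `fixed = TRUE` avoids having to escape the pipe, which is a regex metacharacter.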
denis