
I want to read data from a text file into an R data frame. The data is delimited by pipes (|) and has quotes around the values. I've tried several variations of read.table, but everything is imported into a single field instead of being split into columns. The data looks like this:

"CompetitorDataID"|"CompetitorID"|"ItemID"|"UserID"|"CountryID"|"SegmentID"|"TaskID"|"Price"|"Comment"|"CreateDate"|"GeneralCustomer"|"TenderResult"
"29"|"5"|"187630"|"1375"|"5"|"398"|"4085"|"5.000000"|"test"|"2013-01-10 02:58:23.230000000"|"False"|"1"
"30"|"5"|"1341"|"1294"|"5"|"398"|"4088"|"6.000000"|"test"|"2013-01-10 03:15:26.687000000"|"False"|"1"
"31"|"5"|"1007"|"1375"|"5"|"398"|"4105"|"5.000000"|""|"2013-01-10 05:50:51.150000000"|"False"|"1"

Although this data imports correctly when pasted into R, it does not import from the original text file. I get the following warning messages:

Warning messages:
1: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 1 appears to contain embedded nulls
2: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 2 appears to contain embedded nulls
3: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 3 appears to contain embedded nulls
4: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 4 appears to contain embedded nulls
5: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 5 appears to contain embedded nulls
6: In read.table("competitorDataCopy.txt", header = TRUE, sep = "|") :
  line 1 appears to contain embedded nulls
7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  embedded nul(s) found in input
user3302483
  • Setting `sep="|"` seems to work for me. `read.table(text='"CompetitorDataID"|"CompetitorID"|"ItemID"|"UserID"|"CountryID"|"SegmentID"|"TaskID"|"Price"|"Comment"|"CreateDate"|"GeneralCustomer"|"TenderResult" "29"|"5"|"187630"|"1375"|"5"|"398"|"4085"|"5.000000"|"test"|"2013-01-10 02:58:23.230000000"|"False"|"1" "30"|"5"|"1341"|"1294"|"5"|"398"|"4088"|"6.000000"|"test"|"2013-01-10 03:15:26.687000000"|"False"|"1" "31"|"5"|"1007"|"1375"|"5"|"398"|"4105"|"5.000000"|""|"2013-01-10 05:50:51.150000000"|"False"|"1"', sep="|", header=T)`. – MrFlick Nov 10 '14 at 18:53
  • @MrFlick Make it an answer. – Thomas Nov 10 '14 at 19:44
  • Your actual issue was reading a file with defective encoding under Windows, please edit the question accordingly. This is not about reading quoted PSV files. This will mislead other users. – smci Jul 03 '18 at 16:45

3 Answers

You can import a pipe-delimited .txt file this way:

file_in <- read.table("C:/example.txt", sep = "|")

The same approach works for any character-separated text file; just change sep to match the delimiter.

JamesR

Setting sep="|" seems to work for me. The default for read.table is quote="\"'", so it automatically strips the quotes surrounding each value.

read.table(text='"CompetitorDataID"|"CompetitorID"|"ItemID"|"UserID"|"CountryID"|"SegmentID"|"TaskID"|"Price"|"Comment"|"CreateDate"|"GeneralCustomer"|"TenderResult"
"29"|"5"|"187630"|"1375"|"5"|"398"|"4085"|"5.000000"|"test"|"2013-01-10 02:58:23.230000000"|"False"|"1"
"30"|"5"|"1341"|"1294"|"5"|"398"|"4088"|"6.000000"|"test"|"2013-01-10 03:15:26.687000000"|"False"|"1"
"31"|"5"|"1007"|"1375"|"5"|"398"|"4105"|"5.000000"|""|"2013-01-10 05:50:51.150000000"|"False"|"1"',
sep="|", header=TRUE)
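
To make the role of the quote argument concrete, here is a minimal sketch that reads the same kind of data twice, once with the default quoting and once with quoting turned off. The three-column sample and the variable names psv, stripped and kept are made up for illustration; only read.table and its documented arguments are used.

```r
# Small made-up sample in the same quoted, pipe-separated layout
psv <- '"ItemID"|"Comment"|"TenderResult"
"187630"|"test"|"1"
"1007"|""|"1"'

# Default quoting: the surrounding double quotes are stripped,
# so ItemID and TenderResult can be converted to numbers
stripped <- read.table(text = psv, sep = "|", header = TRUE)

# quote = "" turns quote handling off, so the quote characters
# stay in the column names and in the values
kept <- read.table(text = psv, sep = "|", header = TRUE, quote = "",
                   check.names = FALSE, stringsAsFactors = FALSE)

str(stripped)
str(kept)
```

Comparing the two str() outputs should show numeric columns in the first case and character columns that still carry the literal quote marks in the second.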
MrFlick
  • This solution works, but only on the pasted example data, not on the original text file. I've updated my question to be clearer, as my original question wasn't detailed enough. – user3302483 Nov 11 '14 at 09:31

I solved the issue by opening the file in Notepad and changing the encoding from Unicode to ANSI. I'm not sure why this makes a difference, but the file now imports cleanly.
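
A likely explanation is that Notepad's "Unicode" option saves the file as UTF-16LE, where every ASCII character is followed by a zero byte, and those zero bytes are what read.table reports as embedded nulls. If that is the case, the file can probably be read without re-saving it by passing fileEncoding. The sketch below is an assumption-laden illustration: it builds a small UTF-16LE file in a temporary location (the path, the sample rows and the variable names are hypothetical) and reads it back.

```r
# Made-up sample rows in the same layout as the question
rows <- c('"CompetitorDataID"|"Price"|"Comment"',
          '"29"|"5.000000"|"test"',
          '"31"|"5.000000"|""')

# Write the sample as UTF-16LE bytes to a temporary file,
# imitating what Notepad's "Unicode" option is assumed to produce
path <- tempfile(fileext = ".txt")
txt <- paste0(paste(rows, collapse = "\n"), "\n")
utf16_bytes <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(utf16_bytes, path)

# Declare the encoding so R converts the text while reading
competitor <- read.table(path, header = TRUE, sep = "|",
                         fileEncoding = "UTF-16LE")
str(competitor)
```

If the real file starts with a byte-order mark, fileEncoding = "UTF-16" may be the more suitable value; that depends on how the file was written, so it is worth trying both.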

user3302483
  • This is not really a solution about reading PSV in R. If the actual issue was reading a file with defective encoding under Windows, then please edit the question accordingly. – smci Jul 03 '18 at 16:41