I tried to parse this TSV file using readr::read_tsv, but I keep getting parsing failures. I then realised that the file contains some unusual characters; when I read it with Python, it reported encoding='cp1252'.

I have tried using these:

writeLines(iconv(readLines("Evaluations (1).tab"), from = "cp1252", to = "UTF8"), file("test2.tab", encoding="UTF-8"))

read.delim("Evaluations (1).tab", sep = "\t", encoding = "Windows-1252")

read.table("Evaluations (1).tab", header=TRUE, sep="\t", fileEncoding="CP1252")

None of them worked.

Can someone take a look at this tab file and show me how to parse it?

Thanks!!!

KKW

1 Answer

The file appears to be UCS-2LE encoded, so try:

read.table(file = "Evaluations (1).tab", sep = "\t", header = TRUE, fileEncoding = "UCS-2LE")

 [1] Session.Date                 Date.Completed               Evaluator.Name               Evaluator.Status             Subject.Name                
 [6] Subject.Rotation             Overall.Comments             Subject.Comments             X.Question.1.ID.             X.Question.1.Tags.          
[11] X.Question.1.Response.       X.Question.1.Comment.        X.Question.1.Drop.Down.List. X.Question.2.ID.             X.Question.2.Tags.          
[16] X.Question.2.Response.       X.Question.2.Comment.        X.Question.2.Drop.Down.List. X.Question.3.ID.             X.Question.3.Tags.          
[21] X.Question.3.Response.       X.Question.3.Comment.        X.Question.3.Drop.Down.List. X.Question.4.ID.             X.Question.4.Tags.          
[26] X.Question.4.Response.       X.Question.4.Comment.        X.Question.4.Drop.Down.List. X.Question.5.ID.             X.Question.5.Tags.          
[31] X.Question.5.Response.       X.Question.5.Comment.        X.Question.5.Drop.Down.List. X.Question.6.ID.             X.Question.6.Tags.          
[36] X.Question.6.Response.       X.Question.6.Comment.        X.Question.6.Drop.Down.List. X.Question.7.ID.             X.Question.7.Tags.          
[41] X.Question.7.Response.       X.Question.7.Comment.        X.Question.7.Drop.Down.List.
<0 rows> (or 0-length row.names)
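
One rough way to check an encoding like this yourself is to look at the first few bytes of the file for a byte-order mark (BOM). The sketch below is only illustrative: `guess_bom_encoding` is a hypothetical helper name, it assumes the file actually starts with a BOM (many files do not, in which case it returns `NA`), and it reports `UTF-16LE` for the `FF FE` mark, which is the family that `UCS-2LE` belongs to.

```r
# Hypothetical helper (not part of base R or readr): guess a text file's
# encoding from its byte-order mark. Returns NA when no known BOM is found.
guess_bom_encoding <- function(path) {
  # Read at most the first four bytes of the file as raw values.
  bytes <- readBin(path, what = "raw", n = 4L)

  # TRUE when the file begins with the given byte sequence.
  starts_with <- function(prefix) {
    length(bytes) >= length(prefix) &&
      identical(bytes[seq_along(prefix)], as.raw(prefix))
  }

  # The 4-byte UTF-32 marks are checked first because the UTF-32LE mark
  # begins with the same two bytes as the UTF-16LE mark.
  if (starts_with(c(0xFF, 0xFE, 0x00, 0x00))) return("UTF-32LE")
  if (starts_with(c(0x00, 0x00, 0xFE, 0xFF))) return("UTF-32BE")
  if (starts_with(c(0xEF, 0xBB, 0xBF))) return("UTF-8-BOM")
  if (starts_with(c(0xFF, 0xFE))) return("UTF-16LE")
  if (starts_with(c(0xFE, 0xFF))) return("UTF-16BE")
  NA_character_
}

# Usage (assumes the file from the question is in the working directory):
# guess_bom_encoding("Evaluations (1).tab")
```

If the file has no BOM, `readr::guess_encoding()` offers a statistical guess instead, though its result should be treated as a hint rather than a definitive answer.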
Ritchie Sacramento
  • Great, thanks! That worked. Do you know if readr has a similar function? – KKW Nov 20 '20 at 00:28
  • I don't think `readr` supports multibyte encodings yet, so you need to stay with base R unless you re-encode the file. – Ritchie Sacramento Nov 20 '20 at 00:42
  • OK, thanks for pointing that out. Could you also explain how to find out which fileEncoding to use for a file? – KKW Nov 20 '20 at 12:10
  • I also found this helpful, so I'm sharing it for others: https://github.com/tidyverse/readr/issues/1107 – KKW Nov 20 '20 at 13:00
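
The re-encoding route mentioned in the comments could look roughly like the sketch below, which reads the file through a base R connection declared as UCS-2LE and writes it back out as UTF-8. `reencode_to_utf8` is a hypothetical helper name, and the sketch assumes the session's native encoding can represent every character in the file (for example a UTF-8 locale); otherwise some characters may be lost on the way through.

```r
# Hypothetical helper: copy a text file to a new file encoded as UTF-8.
# `from` is the encoding of the source file; "UCS-2LE" matches the answer above.
reencode_to_utf8 <- function(src, dest, from = "UCS-2LE") {
  # Open the source with its declared encoding so R converts it while reading.
  con_in <- file(src, open = "r", encoding = from)
  on.exit(close(con_in), add = TRUE)
  lines <- readLines(con_in, warn = FALSE)

  # Write the same lines back out, converting to UTF-8 on output.
  con_out <- file(dest, open = "w", encoding = "UTF-8")
  on.exit(close(con_out), add = TRUE)
  writeLines(lines, con_out)

  invisible(dest)
}

# Usage (file names taken from the question; requires readr to be installed):
# reencode_to_utf8("Evaluations (1).tab", "test2.tab")
# evaluations <- readr::read_tsv("test2.tab")
```

Keeping the conversion in a separate step leaves the original file untouched, so a wrong guess for `from` can be retried without any damage.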