I have tried to read the following file in R using read.csv, and it seems that whenever the first line of the file doesn't contain the largest number of columns, read.csv reads it incorrectly. Specifically, when I put the record "CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51 VPS52 VPS53 VPS54 VTC4" on the first line of my file, read.csv reads the file correctly into a 7-row table.
My file:
CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51 VPS52 VPS53 VPS54 VTC4
ERV14 HLJ1 ILM1 KRE1 PER1
BST1 ERV14 ERV25 HLJ1 KIN3 KRE1 LAS21 PER1 VPS38
ANP1 CWH43 ERV14 HLJ1 LAS21 PER1 SUR4 VPS51
CCW12 ERD1 ERV14 OST3 PER1 PMT2 SUM1 SUR4 TED1
ERV14 PER1 SEC66 SSH1 SUR4 VPS51
CCW12 PER1 PMT2 RPN4 SKY1 SUR4 TED1
y=read.csv("./file.txt", sep=" ", header=FALSE)
y
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13
1 CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51 VPS52 VPS53 VPS54 VTC4
2 ERV14 HLJ1 ILM1 KRE1 PER1
3 BST1 ERV14 ERV25 HLJ1 KIN3 KRE1 LAS21 PER1 VPS38
4 ANP1 CWH43 ERV14 HLJ1 LAS21 PER1 SUR4 VPS51
5 CCW12 ERD1 ERV14 OST3 PER1 PMT2 SUM1 SUR4 TED1
6 ERV14 PER1 SEC66 SSH1 SUR4 VPS51
7 CCW12 PER1 PMT2 RPN4 SKY1 SUR4 TED1
But when I put that record somewhere else in the file, read.csv breaks it into two rows, one of which contains the items {CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51} and the other {VPS52 VPS53 VPS54 VTC4}.
My file after I moved the first line to another place:
ERV14 HLJ1 ILM1 KRE1 PER1
BST1 ERV14 ERV25 HLJ1 KIN3 KRE1 LAS21 PER1 VPS38
ANP1 CWH43 ERV14 HLJ1 LAS21 PER1 SUR4 VPS51
CCW12 ERD1 ERV14 OST3 PER1 PMT2 SUM1 SUR4 TED1
ERV14 PER1 SEC66 SSH1 SUR4 VPS51
CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51 VPS52 VPS53 VPS54 VTC4
CCW12 PER1 PMT2 RPN4 SKY1 SUR4 TED1
y=read.csv("./file.txt", sep=" ", header=FALSE)
y
V1 V2 V3 V4 V5 V6 V7 V8 V9
1 ERV14 HLJ1 ILM1 KRE1 PER1
2 BST1 ERV14 ERV25 HLJ1 KIN3 KRE1 LAS21 PER1 VPS38
3 ANP1 CWH43 ERV14 HLJ1 LAS21 PER1 SUR4 VPS51
4 CCW12 ERD1 ERV14 OST3 PER1 PMT2 SUM1 SUR4 TED1
5 ERV14 PER1 SEC66 SSH1 SUR4 VPS51
6 CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51
7 VPS52 VPS53 VPS54 VTC4
8 CCW12 PER1 PMT2 RPN4 SKY1 SUR4 TED1
I have checked with vim that there are no invisible or weird characters in my file other than the spaces between items in a record and the end-of-line characters at the end of each line. So am I doing something wrong, or is it an R problem?
I have seen one post that raises the same issue, but I couldn't find much help there.
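For reference, here is a minimal sketch that reproduces the second layout in a temporary file and checks how many fields each line has with count.fields. The temporary path and the variable names (lines, path, n_fields) are only illustrative. The read.table documentation says the number of columns is determined from the first five lines of input unless col.names is longer, which might explain the split, since the 13-field record is on line 6 here. Assuming that is the cause, passing col.names sized to the widest line appears to keep each record on one row:

```r
# Recreate the reordered file in a temporary location (illustrative path).
lines <- c(
  "ERV14 HLJ1 ILM1 KRE1 PER1",
  "BST1 ERV14 ERV25 HLJ1 KIN3 KRE1 LAS21 PER1 VPS38",
  "ANP1 CWH43 ERV14 HLJ1 LAS21 PER1 SUR4 VPS51",
  "CCW12 ERD1 ERV14 OST3 PER1 PMT2 SUM1 SUR4 TED1",
  "ERV14 PER1 SEC66 SSH1 SUR4 VPS51",
  "CCW12 ERV14 PER1 PTK2 RPN4 SEC66 SKY1 SUR4 VPS51 VPS52 VPS53 VPS54 VTC4",
  "CCW12 PER1 PMT2 RPN4 SKY1 SUR4 TED1"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Count the space-separated fields on every line of the file.
n_fields <- count.fields(path, sep = " ")

# Supply one column name per field of the widest line, so the column
# count no longer depends on what the first few lines look like.
y <- read.csv(path, sep = " ", header = FALSE,
              col.names = paste0("V", seq_len(max(n_fields))))
```

With this, y should have one row per line of the file, with the shorter records padded on the right.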