I have found an issue where R seems to interpret "T" as TRUE even though I have used every option I could find to prevent it (at least according to this post).
Example data (saved as "test.txt"):
col1 col2
1 T
2 T
3 T
4 T
5 T
6 T
7 T
8 T
9 T
Example code:
read.table("test.txt", as.is=TRUE, header=TRUE,
stringsAsFactors=FALSE, colClasses=c(character()))
Produces:
col1 col2
1 1 TRUE
2 2 TRUE
3 3 TRUE
4 4 TRUE
5 5 TRUE
6 6 TRUE
7 7 TRUE
8 8 TRUE
9 9 TRUE
The only workaround I found, which is not ideal, was to set header=FALSE:
read.table("test.txt", as.is=TRUE, header=FALSE,
stringsAsFactors=FALSE,
colClasses=c(character()))
V1 V2
1 col1 col2
2 1 T
3 2 T
4 3 T
5 4 T
6 5 T
7 6 T
8 7 T
9 8 T
10 9 T
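One thing that may be worth checking, as a minimal sketch rather than a confirmed diagnosis: character() evaluates to a zero-length vector, so colClasses=c(character()) likely passes no class names at all and read.table falls back to guessing the column types. The sketch below recreates the example data in a temporary file (the path and the variable names guessed and fixed are illustrative assumptions, not part of the original code) and compares that call with one that names the classes as strings.

```r
# Recreate the example data in a temporary file (illustrative path).
path <- tempfile(fileext = ".txt")
writeLines(c("col1 col2", paste(1:9, "T")), path)

# character() has length zero, so no column classes are actually specified
# here and the column types are guessed from the data.
guessed <- read.table(path, header = TRUE, as.is = TRUE,
                      colClasses = c(character()))

# Naming the classes as strings keeps col2 as text.
fixed <- read.table(path, header = TRUE,
                    colClasses = c("integer", "character"))

length(character())   # 0
class(guessed$col2)   # "logical"
class(fixed$col2)     # "character"
```

If every column should stay as text, colClasses="character" (a single string, recycled across columns) should have the same effect on col2.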
I realize this may seem somewhat contrived, but the edge case is genuine: there is a human gene that is actually named "T" (!), and the values in col1 are positions within that gene.
Thanks in advance for the help