I have a bunch of files that I'm merging into one data frame. The file names look like this: unc.edu.b6530750-0410-43ec-bb79-f862ca3424a6.1918120.rsem.genes.results
I want the file names to be the column names. I'm using the following code:
for (file in file_list){
  if (!exists("dataset")){
    dataset <- read.table(file, header=TRUE, colClasses = c(rep("character", 2), rep("NULL", 2)), col.names = c("gene_id", deparse(substitute(file)), "NuLL", "NULL"), sep="\t")
    print(deparse(substitute(file)))
  }
  if (exists("dataset")){
    temp_dataset <- read.table(file, header=TRUE, colClasses = c(rep("character", 2), rep("NULL", 2)), col.names = c("gene_id", deparse(substitute(file)), "NuLL", "NULL"), sep="\t")
    print(deparse(substitute(file)))
    dataset <- merge(dataset, temp_dataset, by = "gene_id")
    rm(temp_dataset)
  }
}
All goes well, except that the hyphens in the column names are now replaced by dots.
colnames(dataset)
[1] "gene_id"
[2] "X...unc.edu.02cb8dbe.ef56.471c.b52d.41c29219fd95.1794854.rsem.genes.results..x"
[3] "X...unc.edu.02cb8dbe.ef56.471c.b52d.41c29219fd95.1794854.rsem.genes.results..y"
[4] "X...unc.edu.02f5dcba.bdcc.4424.aed4.195a8d551325.2085643.rsem.genes.results."
Any explanation as to what causes this would be helpful because I will need to change these names, using another file, later on.
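For reference, the renaming can be reproduced without any of the data files. This is a minimal sketch, assuming the cause is read.table's default check.names = TRUE, which passes column names through make.names(). The file name is taken from the output above; the leading "./" and the two-column sample text are assumptions made only for illustration.

```r
# File name from the output above; the "./" prefix is an assumption
# about how file_list was built (e.g. list.files(full.names = TRUE)).
file <- "./unc.edu.02cb8dbe-ef56-471c-b52d-41c29219fd95.1794854.rsem.genes.results"

# deparse() of a character value keeps the surrounding double quotes.
quoted <- deparse(file)

# With check.names = TRUE, read.table() runs column names through
# make.names(), which prepends "X" when the name does not start with a
# letter or dot and turns every invalid character into a dot.
mangled <- make.names(quoted)
print(mangled)

# Passing the plain string with check.names = FALSE keeps the name intact.
# The tab-separated text below is made-up sample data, not the real file.
tsv <- "gene_id\traw_count\nA\t1\nB\t2"
kept <- read.table(text = tsv, header = TRUE, sep = "\t",
                   col.names = c("gene_id", basename(file)),
                   check.names = FALSE)
print(colnames(kept))
```

In this sketch the first print shows the same "X..." prefix, dotted hyphens and trailing dot as the column names above, while the second keeps the hyphens. The .x and .y suffixes look like a separate effect: the second if also runs on the first iteration, so the first file is merged with itself; using else there would probably avoid that.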