I currently have hundreds of files containing unique IDs and unnormalized read counts. I want to take the read counts from each file and match them all to the unique IDs in the first column. However, each file has a different number of counts and different IDs that may or may not overlap with those in the other files. (Basically I'm looking to make a counts matrix for DESeq2.)
I was using the code below to combine these files but the counts don't match up with the original IDs.
My overall goal is to take the unnormalized read counts from every file and match them to a data frame keyed by the full list of unique IDs -- if a file has no count for a particular ID, that entry should just be filled with 0.
```r
library(gdata)  # for cbindX

DF = do.call(cbindX,
             lapply(list.files(pattern = "\\.txt$"),
                    FUN = function(x) {
                      aColumn = read.delim(x, header = TRUE)[, c("MINTbase.Unique.ID",
                                                                 "Unnormalized.read.counts")]
                      colnames(aColumn)[2] = x
                      aColumn
                    }))
DF = DF[, !duplicated(colnames(DF))]
```
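For what it's worth, `cbindX` pads columns to equal length but binds them positionally, so rows are never aligned by ID. A sketch of the ID-keyed alignment described above, using base R's `merge()` with `all = TRUE` to keep the union of IDs and then zero-filling the gaps (the `sample1`/`sample2` data frames here are stand-ins for the real files read with `read.delim`):

```r
# Two toy tables standing in for two input files; column names match
# the ones in the question.
sample1 <- data.frame(MINTbase.Unique.ID = c("tRF-1", "tRF-2"),
                      Unnormalized.read.counts = c(10, 5))
sample2 <- data.frame(MINTbase.Unique.ID = c("tRF-2", "tRF-3"),
                      Unnormalized.read.counts = c(7, 3))

columns <- list(sample1 = sample1, sample2 = sample2)
# Rename each counts column after its file/sample, as in the original code
columns <- Map(function(df, nm) { colnames(df)[2] <- nm; df },
               columns, names(columns))

# Full outer join on the ID column: rows are matched by ID, and IDs
# missing from a file come through as NA.
DF <- Reduce(function(a, b) merge(a, b, by = "MINTbase.Unique.ID", all = TRUE),
             columns)
DF[is.na(DF)] <- 0
DF
```

For DESeq2 you would then move the ID column into the rownames (e.g. `rownames(DF) <- DF$MINTbase.Unique.ID; DF$MINTbase.Unique.ID <- NULL`) before passing the matrix to `DESeqDataSetFromMatrix`.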