I have multiple tab-delimited files with the same name in different folders, like this:
F:/RNASEQ2019/ballgown/abundance_est/RBRN02.sorted.bam\t_data.ctab
F:/RNASEQ2019/ballgown/abundance_est/RBRN151.sorted.bam\t_data.ctab
Each file has 5-6 common columns, and I want to pick two of them: Gene and FPKM. The Gene column is the same in all files; only the FPKM values differ. I want to take the Gene and FPKM columns from each file and make a master file like this:
Gene RBRN02 RBRN03 RBRN151
gene1 67 699 88
gene2 66 77 89
I tried this:
import os
import pandas as pd

path = "F:/RNASEQ2019/ballgown/abundance_est/"
files = []
# r = root, d = directories, f = files
for r, d, f in os.walk(path):
    for file in f:
        if 't_data.ctab' in file:
            files.append(os.path.join(r, file))

# Read every t_data.ctab into its own DataFrame
df = []
for f in files:
    df.append(pd.read_csv(f, sep="\t"))
But this only gives me a list of separate DataFrames; it does not merge them side by side. How do I get the format shown above? Please help.
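For reference, the kind of result I am after could be sketched roughly as below. This is only a minimal sketch under some assumptions: the column headers are literally `Gene` and `FPKM` as described above (adjust `gene_col` and `fpkm_col` if the real headers differ), each gene appears once per file, and the sample name is the folder name up to the first dot. The helper names `sample_name` and `build_master` are made up for illustration.

```python
import os
import pandas as pd


def sample_name(ctab_path):
    # Assumption: the sample ID is the parent folder name up to the first dot,
    # e.g. a folder called "RBRN02.sorted.bam" gives "RBRN02".
    folder = os.path.basename(os.path.dirname(ctab_path))
    return folder.split(".")[0]


def build_master(files, gene_col="Gene", fpkm_col="FPKM"):
    # gene_col / fpkm_col are assumed header names; change them if needed.
    columns = []
    for f in files:
        # Read only the two columns of interest from each tab-separated file.
        table = pd.read_csv(f, sep="\t", usecols=[gene_col, fpkm_col])
        # Index by gene and name the FPKM series after the sample.
        series = table.set_index(gene_col)[fpkm_col].rename(sample_name(f))
        columns.append(series)
    # axis=1 places the samples side by side, aligned on the gene index.
    master = pd.concat(columns, axis=1)
    # Turn the gene index back into an ordinary first column.
    return master.rename_axis(gene_col).reset_index()
```

With the `files` list collected by the `os.walk` loop above, something like `build_master(files).to_csv("master.tsv", sep="\t", index=False)` would then write the combined table (the output file name is just an example). If a gene can occur on more than one row per file, the values would probably need to be aggregated per gene before concatenating.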