I have multiple text files in a folder that I'm trying to read and write into a dictionary. The files look like this:
file1.txt:
chr17 1 1 T C C 5
chr13 2 2 A A G 4
file2.txt:
chr17 1 1 T C C 5
chr17 2 2 A A G 4
Code:
import os, csv, glob
mydict = {}
for file in glob.glob(os.path.join(os.getcwd(), '*.txt')):
    with open(file) as f:
        for line in f:
            mydict[",".join(line.split()[0:4])] = ",".join(line.split()[4:6])
    for (key, val) in mydict.items():
        print file, key, val
I expect it to print all four rows from the two files, with the first four columns as the key and columns 5 and 6 as the value:
file1.txt chr17,1,1,T C,C
file1.txt chr13,2,2,A A,G
file2.txt chr17,1,1,T C,C
file2.txt chr17,2,2,A A,G
But I get this instead:
file1.txt chr17,1,1,T C,C
file1.txt chr13,2,2,A A,G
file2.txt chr17,1,1,T C,C
file2.txt chr13,2,2,A A,G (extra row: this row is in file1, not in file2)
file2.txt chr17,2,2,A A,G
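
The extra row is consistent with mydict being created once, before the loop over files, so the keys read from file1.txt are still in it when the rows for file2.txt are printed. Below is a minimal sketch of one way to avoid that, assuming Python 3 and that each file should get its own dictionary. The names read_pairs and print_rows are hypothetical and not part of the code above, and skipping lines with fewer than six columns is an added assumption.

```python
import glob
import os


def read_pairs(path):
    """Map the first four columns of each line to columns 5 and 6."""
    pairs = {}  # created fresh on every call, so nothing carries over between files
    with open(path) as f:
        for line in f:
            cols = line.split()
            if len(cols) < 6:
                continue  # assumption: ignore blank or short lines
            pairs[",".join(cols[0:4])] = ",".join(cols[4:6])
    return pairs


def print_rows(folder):
    """Print file name, key, and value for every .txt file in folder."""
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        for key, val in read_pairs(path).items():
            print(os.path.basename(path), key, val)
```

Calling print_rows(os.getcwd()) would then list only the rows that belong to each file. If a single combined dictionary is needed instead, the file name could be made part of the key so that rows from different files do not overwrite each other.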