I'm a beginner and I need help with my code.

I have two text files, each a two-column list, that I would like to use as dictionaries. Both files have the same format: column 1 holds the key, and column 2 holds the associated values separated by "|". Not every key is present in both files.

Example: File1.txt

1   b|c    
2   a|b|d    
3   a|b|c    
4   a|b    
9   a|b|c

File2.txt

1   a|c  
2   a|b|c|d|e    
3   a|b|c    
6   a|b|c    
7   a|b|c    
8   a|b|c    
9   x|y
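
To make the format above concrete, here is a minimal Python 3 sketch of reading such lines into a dictionary that maps each key to a set of values. The function name `parse_table` is hypothetical, and the sketch assumes the two columns are separated by whitespace (tabs or spaces); adjust the split if the real files differ.

```python
def parse_table(lines):
    """Map each key to the set of its '|'-separated values."""
    table = {}
    for line in lines:
        fields = line.split()  # assumption: columns are whitespace-separated
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        key, values = fields
        table[key] = set(values.split('|'))
    return table


# A few lines in the File1.txt format shown above
file1_lines = ["1\tb|c\n", "2\ta|b|d\n", "4\ta|b\n"]
table1 = parse_table(file1_lines)
print(sorted(table1["2"]))  # ['a', 'b', 'd']
```

Because `parse_table` accepts any iterable of lines, an open file object can be passed to it directly.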

I would like to create a file, File3.txt, that lists every key from either file together with the values that both files share for that key. The cell should be blank when the key is not present in both files, and should read 'no matches' when the key is present in both files but they share no values. (The 'no matches' part is an afterthought, so it doesn't show up in my code below.)

Example: File3.txt

1   c    
2   a|b|d    
3   a|b|c    
4       
6       
7       
8       
9   no matches
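
The three cases described above (shared values, blank cell, 'no matches') can be sketched as a single function over two sets. This is only an illustration in Python 3: `common_cell` is a hypothetical name, and passing `None` for a key that is missing from a file is an assumption of this sketch.

```python
def common_cell(values1, values2):
    """Return the File3.txt cell for one key.

    values1 / values2 are sets of values, or None when the key is
    missing from that file (an assumption of this sketch).
    """
    if values1 is None or values2 is None:
        return ""            # key not in both files: blank cell
    shared = values1 & values2
    if not shared:
        return "no matches"  # key in both files, nothing in common
    return "|".join(sorted(shared))  # sorted for a stable order


print(common_cell({"b", "c"}, {"a", "c"}))       # c
print(common_cell({"a", "b", "c"}, {"x", "y"}))  # no matches
```

Sorting the shared values is a design choice: sets have no fixed order, so joining them unsorted can produce `a|c|b` on one run and `a|b|c` on another.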

The following is the code I've written so far. I think I'm totally off but would appreciate any help. Thank you!

#!/usr/bin/env python

table = {}
ref2gene = {}
table2 = {}
ref2gene2 = {}
with open('File1.txt') as f_in:  
    for line in f_in:
        row = line.strip()
        table[line.split('\t')[0]] = line.split('\t')[1]
        gene_name = row[0]        
        for ref in row[1].split('|'):
            ref2gene[ref] = gene_name

with open('File2.txt') as f_1, open('File3.txt', 'w') as f_2:   
    for line in f_1:
        row2 = line.strip()
        table2[line.split('\t')[0]] = line.split('\t')[1]        
        gene_name2 = row2[0]        
        for ref2 in row2[1].split('|'):
            ref2gene2[ref2] = gene_name2

def intersect_two_dicts (table, table2):
    return { k:v for k,v in table.iteritems() if ((k in table2)and(table[k]==table2[k])) }        

print (intersect_two_dicts(dicts[0], dicts[1]))     
AstroCB
rod12160

1 Answer

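Try it this way. The problem can be solved with dictionaries; to produce File3.txt, replace the print statements with file writes. (The code below is Python 2. Here a.txt holds the File2.txt data and b.txt holds the File1.txt data.)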

# Build a dict from a.txt: key -> 'x|y|z' value string (blank lines skipped)
with open('a.txt') as f:
    file1 = {i.split()[0]: i.split()[1] for i in f if i.strip()}
print file1
#{'1': 'a|c', '3': 'a|b|c', '2': 'a|b|c|d|e', '7': 'a|b|c', '6': 'a|b|c', '9': 'x|y', '8': 'a|b|c'}


# Build a dict from b.txt the same way
with open('b.txt') as f:
    file2 = {i.split()[0]: i.split()[1] for i in f if i.strip()}
print file2
#{'1': 'b|c', '9': 'a|b|c', '3': 'a|b|c', '2': 'a|b|d', '4': 'a|b'}

keys=set(file1.keys()+file2.keys())

for i in sorted(keys):
    if i in file1 and i in file2:
        file1_values=file1[i].split('|')
        file2_values=file2[i].split('|')
        intersec=set(file1_values)&set(file2_values)
        if len(intersec)>0:
            print i,'|'.join(intersec)
        else:
            print i, 'missing values'
    else:
        print i,'empty'

Full output:

1 c
2 a|b|d
3 a|c|b
4 empty
6 empty
7 empty
8 empty
9 missing values
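
For the "replace print with file write" step, a minimal Python 3 sketch of the same approach might look like the following. It uses the wording from the question (a blank cell and 'no matches') instead of 'empty' and 'missing values'. The function name `write_common` is hypothetical, the sketch assumes every key is an integer string so that keys can be sorted numerically, and `io.StringIO` stands in for `open('File3.txt', 'w')` so the example runs without touching the disk.

```python
import io


def write_common(table1, table2, out):
    """Write one 'key<TAB>cell' line per key found in either table."""
    all_keys = set(table1) | set(table2)
    # Assumption: keys are integer strings, so '10' sorts after '9'.
    for key in sorted(all_keys, key=int):
        if key in table1 and key in table2:
            shared = set(table1[key].split('|')) & set(table2[key].split('|'))
            # Sort the shared values so the output order is stable.
            cell = '|'.join(sorted(shared)) if shared else 'no matches'
        else:
            cell = ''  # key missing from one file: blank cell
        out.write('{}\t{}\n'.format(key, cell))


# Dicts shaped like file1 and file2 above: key -> 'x|y|z' string
file1 = {'1': 'b|c', '4': 'a|b', '9': 'a|b|c', '10': 'a'}
file2 = {'1': 'a|c', '9': 'x|y', '10': 'a'}

buf = io.StringIO()  # stand-in for open('File3.txt', 'w')
write_common(file1, file2, buf)
print(buf.getvalue(), end='')
```

To write the real file, pass a file object instead of the buffer, for example `with open('File3.txt', 'w') as f_out: write_common(file1, file2, f_out)`.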
sundar nataraj
  • That works! Thank you so much! I would give you a +1, but I can't yet; I have to wait until I have 15 rep. I really appreciate your help :) – rod12160 Aug 21 '14 at 02:30