I need to find matching names between two lists. One list has 400 names and the other has 90,000. My code gives the desired result, but it takes more than 35 minutes. The bottleneck is obvious: there are two nested for loops, so the work grows as O(N*M), which is up to 400 × 90,000 = 36 million calls to fuzz.ratio. I have already removed the duplicates from both lists. Can you help me improve it? I checked many other questions but couldn't work out how to apply their answers to my case. If I have missed an existing post that covers this, please point me to it and I will do my best to understand and replicate it.
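To put a number on the bottleneck, here is a rough back-of-the-envelope estimate. It assumes the worst case, where the inner loop never reaches `break`, and it treats the run time as exactly 35 minutes. The variable names are only for illustration.

```python
# List sizes and run time as stated in the question.
names_to_find = 400
master_names = 90_000
elapsed_seconds = 35 * 60

# Worst case: every name is compared against every master name.
worst_case_comparisons = names_to_find * master_names
seconds_per_comparison = elapsed_seconds / worst_case_comparisons

print(worst_case_comparisons)  # 36000000
print(f"{seconds_per_comparison * 1e6:.1f} microseconds per comparison")
```

So each comparison is cheap on its own. The cost comes from the number of comparisons, which suggests that skipping comparisons would help more than speeding up a single one.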
Below is my code:
from fuzzywuzzy import fuzz

infile = open('names.txt', 'r')
name = infile.readline()
name_list = []
while name:
    name_list.append(name.strip())
    name = infile.readline()
print(name_list)

infile2 = open('names2.txt', 'r')
name2 = infile2.readline()
name_list2 = []
while name2:
    name_list2.append(name2.strip())
    name2 = infile2.readline()
print(name_list2)

# For each name to find, keep the first master name that scores above 90
response = {}
for name_to_find in name_list:
    for name_master in name_list2:
        if fuzz.ratio(name_to_find, name_master) > 90:
            response[name_to_find] = name_master
            break

for key, value in response.items():
    print("Key is ->" + key + " Value is -> " + value)
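One direction that might cut the number of comparisons is to skip candidates that cannot possibly reach the threshold. The sketch below is not a drop-in replacement for the code above. It uses the standard library's `difflib.SequenceMatcher` instead of `fuzz.ratio`, so scores are on a 0 to 1 scale and `threshold=0.9` only roughly corresponds to `> 90`. It relies on the fact that `ratio()` is `2*M / T`, where `M` is the number of matching characters and `T` is the combined length, so two strings of very different lengths can never score high. The function names `index_by_length` and `find_matches` are hypothetical, and the sample names are made up.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def index_by_length(names):
    """Group names by length so whole groups can be skipped at once."""
    buckets = defaultdict(list)
    for name in names:
        buckets[len(name)].append(name)
    return buckets


def find_matches(names_to_find, master_names, threshold=0.9):
    """Map each name to one master name whose ratio exceeds threshold."""
    master_set = set(master_names)
    buckets = index_by_length(master_names)
    response = {}
    for name in names_to_find:
        # An exact match scores 1.0, so a set lookup settles it directly.
        if name in master_set:
            response[name] = name
            continue
        n = len(name)
        # SequenceMatcher caches details of its second sequence, so the
        # name being searched for is set once and reused for every candidate.
        matcher = SequenceMatcher(None, "", name)
        found = None
        for length, candidates in buckets.items():
            # M cannot exceed the shorter length, so 2*min(n, length) / (n + length)
            # is an upper bound on the ratio; skip buckets that cannot pass.
            if 2 * min(n, length) <= threshold * (n + length):
                continue
            for candidate in candidates:
                matcher.set_seq1(candidate)
                if matcher.ratio() > threshold:
                    found = candidate
                    break
            if found is not None:
                break
        if found is not None:
            response[name] = found
    return response


# Made-up sample data, just to show the call.
sample_to_find = ["Jonathan Smith", "Alice", "Zed"]
sample_master = ["Jonathon Smith", "Alice", "Bob", "Alexander the Great"]
for key, value in find_matches(sample_to_find, sample_master).items():
    print("Key is ->" + key + " Value is -> " + value)
```

Two caveats: the match returned for a name may differ from the original loop's, because exact matches are preferred and buckets are visited by length rather than in file order. How much time this saves depends on how spread out the name lengths are, so it would need measuring on the real lists.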