I have created an object called Issuer, which contains a member named issuer_name.
I want to take advantage of fuzzywuzzy's process.extract() function, but it only takes in a list of strings. My goal is to find matches and return the list of objects that match by the issuer_name.
I came up with the method below, but it runs really slowly. The issuers list contains over 100,000 elements.
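from fuzzywuzzy import fuzz

# Arguments: (string, list of Issuer objects, integer score threshold 0-100)
def fuzzyMatchWordToIssuers(word, issuers, threshold):
    limit = 5  # stop as soon as this many matches are found
    count = 0
    res = []
    for issuer in issuers:
        # Score the query against this issuer's name
        calc = fuzz.token_set_ratio(word, issuer.issuer_name)
        if calc >= threshold:
            res.append(issuer)
            count += 1
            if count == limit:
                return res
    return res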
Is it possible to use process.extract() somehow, or to speed this up?
For reference, here's the GitHub example:
process.extract("new york jets", choices, limit=2)
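To make the goal concrete, here is a rough sketch of the "match on names, return objects" idea: build a lookup from each distinct issuer_name to its objects, score each distinct name once, then map the best names back to objects. Everything here is an assumption for illustration: the Issuer dataclass is a stand-in for my real class, best_issuers and simple_ratio are hypothetical names, and the default scorer uses the standard library's difflib so the snippet runs without fuzzywuzzy. Since fuzz.token_set_ratio also takes two strings and returns a 0-100 score, it should be usable as the scorer argument instead. Note that, unlike my loop above, this scans every name and returns the highest-scoring matches rather than the first few that pass the threshold, so it only saves work when names repeat.

```python
import heapq
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Issuer:
    # Hypothetical stand-in for the Issuer class described in the question.
    issuer_name: str


def simple_ratio(a, b):
    # Stand-in scorer returning 0-100, built on the standard library.
    # Assumption: fuzz.token_set_ratio could be passed in its place.
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_issuers(word, issuers, threshold, limit=5, scorer=simple_ratio):
    # Group issuers by name so each distinct name is scored only once.
    by_name = {}
    for issuer in issuers:
        by_name.setdefault(issuer.issuer_name, []).append(issuer)

    # Score lazily, keep only names at or above the threshold,
    # and pick the highest-scoring ones without sorting everything.
    scored = ((scorer(word, name), name) for name in by_name)
    passing = (pair for pair in scored if pair[0] >= threshold)
    top = heapq.nlargest(limit, passing)

    # Map the winning names back to the objects that carry them.
    result = []
    for _score, name in top:
        result.extend(by_name[name])
    return result[:limit]


issuers = [
    Issuer("New York Jets"),
    Issuer("New York Giants"),
    Issuer("Dallas Cowboys"),
]
matches = best_issuers("new york jets", issuers, threshold=90)
```

With the sample data above, matches holds Issuer objects rather than strings, which is the shape of result I am after.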