Try, Except / If Statement Combination - Missing results

Question

I am comparing one list of universities with 12 other lists, finding fuzzy string matches and writing all results to a csv. I am not doing the fuzzy string match to one big list as I need to know what list the match came from. Example of the lists:

data = [["1-00000", "MIT"], ["1-00001", "Stanford"], ...]

Data1 = ['MASSACHUSETTS INSTITUTE OF TECHNOLOGY (MIT)'], ['STANFORD UNIVERSITY'],...

With StackOverflow's help I got as far as:

for uni in data:
    hit = process.extractOne(str(uni[1]), data10, scorer = fuzz.token_set_ratio, score_cutoff = 90)
    try:
        if float(hit[1]) >= 94:
            with open(filename, mode='a', newline="") as csv_file:
                fieldnames = ['bwbnr', 'uni_name', 'match', 'points']
                writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
                writer.writerow({'bwbnr': str(uni[0]), 'uni_name': str(uni[1]), 'match': str(hit), 'points': 10})

    except:
        hit1 = process.extractOne(str(uni[1]), data11, scorer = fuzz.token_set_ratio, score_cutoff = 90)
        try:
            if float(hit1[1]) >= 94:
                with open(filename, mode='a', newline="") as csv_file:
                    fieldnames = ['bwbnr', 'uni_name', 'match', 'points']
                    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
                    writer.writerow({'bwbnr': str(uni[0]), 'uni_name': str(uni[1]), 'match': str(hit), 'points': 5})

This pattern continues down the 12 lists until the last except blocks, where I include matches with scores lower than 94 and end with a "not found" row:

    except:
        hit12 = process.extractOne(str(uni[1]), data9, scorer = fuzz.token_set_ratio)
        try:
            if float(hit12[1]) < 94:
                with open(filename, mode='a', newline="") as csv_file:
                    fieldnames = ['bwbnr', 'uni_name', 'match', 'points']
                    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
                    writer.writerow({'bwbnr': str(uni[0]), 'uni_name': str(uni[1]), 'match': str(hit), 'points': 3})
        except:
            with open(filename, mode='a', newline="") as csv_file:
                fieldnames = ['bwbnr', 'uni_name', 'match', 'points']
                writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
                writer.writerow({'bwbnr': str(uni[0]), 'uni_name': str(uni[1]), 'match': str(hit), 'points': 3})

However, I get back only 2854 rows rather than the 3175 entries in my original list (all of which need to be checked and written to the new csv).

When I throw all my lists together and do my extractOne I do get 3175 results:

scored_testdata = []
for uni in data:
    hit = process.extractOne(str(uni[1]), big_list, scorer = fuzz.token_set_ratio, score_cutoff = 90)
    scored_testdata.append(hit)
print(len(scored_testdata))

What am I missing here? I get the feeling results returning "None" in the process.extractOne are being dropped for some reason. Any help would be much appreciated.

The full code can be found here.

please fix your indentation - your code is no [mcve] -it is hard to see what you do... — Patrick Artner, Jan 31 '19 at 15:40
Why do you write _nothing_ into your file for `if float(hit[1]) >= 94: ...` ? you only write empty strings in it... - why the try: except: at all? ... the code makes not much sense to me - sorry — Patrick Artner, Jan 31 '19 at 15:41
Please reduce your code to a meaningful [mcve] that contains demodata and replicates your problems. — Patrick Artner, Jan 31 '19 at 15:43
Apologies, I dropped what's written to the csv as it made the code messy and long, focused too much on the "minimal" requirement. Added a link to the full code. Editing the question now. Thanks! — Uralan, Jan 31 '19 at 15:49

score 0 · Accepted Answer · answered Feb 03 '19 at 15:33

The final try-except should have been one that checks all the lists at once (big_list) and calls extractOne without score_cutoff, so it always returns a best match and a row is always written:

except:
    hit12 = process.extractOne(str(uni[1]), big_list, scorer = fuzz.token_set_ratio)
    with open(filename, mode='a', newline="") as csv_file:
        fieldnames = ['bwbnr', 'uni_name', 'match', 'confidence', 'points']
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
        writer.writerow({'bwbnr': str(uni[0]), 'uni_name': str(uni[1]), 'match': "CHECK AGAIN " + str(hit12[0]), 'confidence': str(hit12[1]), 'points': 3})
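
A likely reason rows went missing in the nested version: `hit[1]` only raises (and so only reaches the next `except`) when `extractOne` returns `None`. A hit that passes `score_cutoff = 90` but scores below 94 raises nothing, fails the `if`, and no row is written for that university. The sketch below shows one way to avoid this with a flat loop in place of nested try/except blocks, so that every input produces exactly one row. It is an illustration under assumptions, not the original script: `similarity`, `best_match`, `classify` and `write_matches` are hypothetical names, `difflib` stands in for `fuzz.token_set_ratio` so the example runs without fuzzywuzzy, and the sample data is made up.

```python
import csv
import io
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in scorer on a 0-100 scale (assumption: replaces fuzz.token_set_ratio)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, choices, scorer=similarity):
    """Return (choice, score) for the best-scoring choice, or None if choices is empty."""
    scored = [(choice, scorer(name, choice)) for choice in choices]
    return max(scored, key=lambda pair: pair[1], default=None)


def classify(name, ranked_lists, threshold=94):
    """Return (match, score, points) from the first list with a hit >= threshold.

    If no list qualifies, fall back to the best hit seen in any list and flag
    it for manual review, so the caller always gets a result.
    """
    fallback = None
    for choices, points in ranked_lists:
        hit = best_match(name, choices)
        if hit is None:
            continue  # empty list: nothing to compare against
        if hit[1] >= threshold:
            return hit[0], hit[1], points
        if fallback is None or hit[1] > fallback[1]:
            fallback = hit  # remember the best sub-threshold hit
    if fallback is None:
        return "NOT FOUND", 0.0, 0
    return "CHECK AGAIN " + fallback[0], fallback[1], 3


def write_matches(data, ranked_lists, csv_file):
    """Write exactly one row per entry in data."""
    fieldnames = ['bwbnr', 'uni_name', 'match', 'confidence', 'points']
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, delimiter=';')
    for bwbnr, uni_name in data:
        match, score, points = classify(str(uni_name), ranked_lists)
        writer.writerow({'bwbnr': str(bwbnr), 'uni_name': str(uni_name),
                         'match': match, 'confidence': round(score, 1),
                         'points': points})


# Made-up demo data: each reference list is paired with the points it awards.
data = [["1-00000", "MIT"], ["1-00001", "Stanford University"],
        ["1-00002", "Unknown College"]]
ranked_lists = [(["STANFORD UNIVERSITY"], 10),
                (["Massachusetts Institute of Technology"], 5)]

buffer = io.StringIO()  # stands in for open(filename, mode='a', newline="")
write_matches(data, ranked_lists, buffer)
rows = buffer.getvalue().splitlines()
print(len(rows), "rows written for", len(data), "inputs")
```

Opening the file once and passing the handle in, rather than reopening it for every row, also keeps the writing logic in a single place; comparing the number of rows written with `len(data)` is a quick check that nothing was dropped.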
