Python 3.9.5: One dictionary assignment is overwriting multiple keys [BUG?]

Question

I am reading a .csv called courses. Each row corresponds to a course which has an id, a name, and a teacher. They are to be stored in a Dict. An example:

list_courses = { 
    1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
    ... 
 }

While iterating the rows using enumerate(file_csv.readlines()) I am performing the following:

list_courses={}

for idx, row in enumerate(file_csv.readlines()):
                # Skip blank rows.
                if row.isspace(): continue
                
                # If we're using the row, turn it into a list.
                row = row.strip().split(",")

                # If it's the header row, take note of the header. Use these values for the dictionaries' keys.
                # As of 3.7 a Dict remembers the order in which the keys were inserted.
                # Since the order is constant, simply load each other row into the corresponding key.           
                if not idx: 
                    sheet_item = dict.fromkeys(row)
                    continue
                
                # Loop through the keys in sheet_item. Assign the value found in the row, converting to int where necessary.
                for idx, key in enumerate(list(sheet_item)):
                    sheet_item[key] = int(row[idx].strip()) if key == 'id' or key == 'mark' else row[idx].strip()


                # Course list
                print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
                list_courses[sheet_item['id']] = sheet_item
                print("\tADDED: {}".format(sheet_item))
                print("\tDICT : {}".format(list_courses))

Thus, the list_courses dictionary is printed after each sheet_item is added to it.

Now comes the issue - when reading in two courses, I expect that list_courses should read:

list_courses = { 
    1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
    2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
 }

However, the output of my print statements (substantiated by errors later in my program) is:

ADDING COURSE WITH ID 1 TO THE DICTIONARY:
        ADDED: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}
        DICT : {1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}}
ADDING COURSE WITH ID 2 TO THE DICTIONARY:
        ADDED: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
        DICT : {1: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}, 2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}}

Thus, the id under which each sheet_item is added to list_courses is correct (1 or 2). However, the assignment for the second course appears to overwrite the value for key 1 as well. I'm not even sure how this is possible. Please let me know your thoughts.

I might be missing it, where does `list_courses` get created/instantiated? — Daniel Butler, May 23 '21 at 14:40
@DanielButler I omitted it, but I'll add it - it's simply `list_courses = {}` before the loop. — KuboMD, May 23 '21 at 14:42
The part of the code that looks suspect is where it loops through the keys in sheet item. — Daniel Butler, May 23 '21 at 14:51
@DanielButler I don't believe so. When I output statements such as `key` and `row[idx]` the output is expected. e.g. "id <- 1" and "id <- 2" — KuboMD, May 23 '21 at 14:56

score 1 · Accepted Answer · answered May 23 '21 at 15:33

You're using the same dictionary for both the header and all the rows. You never create any new dictionaries after the header, so every entry in list_courses refers to that one dictionary object. Each row's key assignments overwrite the previous row's values, and the change shows up under every id.
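A minimal sketch of that aliasing, stripped of the file handling. The sample tuples below are made up for illustration, taken from the values in the question's output:

```python
# One dict object is created once and then stored under two different keys.
sheet_item = dict.fromkeys(["id", "name", "teacher"])
list_courses = {}

# Illustrative rows only; these stand in for the parsed CSV lines.
for course_id, name, teacher in [(1, "Biology", "Mr. D"), (2, "History", "Mrs. P")]:
    sheet_item["id"] = course_id
    sheet_item["name"] = name
    sheet_item["teacher"] = teacher
    # This stores a reference to sheet_item, not a copy of it.
    list_courses[course_id] = sheet_item

print(list_courses[1] is list_courses[2])  # True: both keys point at one object
print(list_courses[1]["name"])             # History
```

Assignment into a dictionary never copies the value, so mutating sheet_item afterwards is visible through every key that holds it.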

Store the keys in a list, and make a new sheet_item for each row, just before the inner for loop:

list_courses = {}
keys = None  # Header names; filled in when the first row is read.

for idx, row in enumerate(file_csv.readlines()):
    # Skip blank rows.
    if row.isspace():
        continue

    # If we're using the row, turn it into a list.
    row = row.strip().split(",")

    # If it's the header row, take note of the header. Use these values for the dictionaries' keys.
    if not idx:
        keys = row
        continue

    # Create a brand-new dictionary for this row, so earlier records are not overwritten.
    sheet_item = {}
    # Loop through the header keys. Assign the value found in the row, converting to int where necessary.
    # The inner index is named col so it does not shadow the outer idx.
    for col, key in enumerate(keys):
        sheet_item[key] = int(row[col].strip()) if key in ('id', 'mark') else row[col].strip()

    # Course list
    print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
    list_courses[sheet_item['id']] = sheet_item
    print("\tADDED: {}".format(sheet_item))
    print("\tDICT : {}".format(list_courses))
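As an alternative, the standard library's csv.DictReader already yields a separate dict for every data row, which avoids the shared-dictionary problem entirely. The following is only a sketch under assumptions: the in-memory sample_csv string, the load_courses helper, and the INT_COLUMNS set are hypothetical names, not part of the original program.

```python
import csv
import io

# Hypothetical stand-in for the courses file, including a blank row.
sample_csv = "id,name,teacher\n1,Biology,Mr. D\n\n2,History,Mrs. P\n"

# Columns to convert to int, mirroring the 'id' / 'mark' check in the question.
INT_COLUMNS = {"id", "mark"}

def load_courses(file_obj):
    """Return {course id: record}, with one new dict per CSV row."""
    courses = {}
    # DictReader takes the first row as the header and skips empty rows.
    for raw in csv.DictReader(file_obj):
        record = {}
        for key, value in raw.items():
            value = value.strip()
            record[key] = int(value) if key in INT_COLUMNS else value
        courses[record["id"]] = record
    return courses

list_courses = load_courses(io.StringIO(sample_csv))
print(list_courses)
```

With a real file, the same function could be called on an object opened with open(path, newline=""), as the csv module documentation recommends.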

Hey Luther, thanks a lot, this works. I don't understand why, though. What's the difference between using `for Key in (list(sheet_item))` and using `for Key in ['id', 'name', 'teacher']`? — KuboMD, May 23 '21 at 17:38
`sheet_item` represents an individual record and must be re-initialized on every iteration. If it's re-initialized, it can't already know what keys it will have. The keys have to be stored in their own variable on the first iteration, when you have the header but no record data. — luther, May 24 '21 at 01:09
Finding a Python bug in anything as important as dictionary is incredibly unlikely - always suspect your own code first, second and third. — DisappointedByUnaccountableMod, May 24 '21 at 09:45
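To illustrate luther's point about keeping the header in its own variable: where the key names come from makes no difference, only whether a new dictionary is built for each row. A small sketch, with hypothetical pre-split rows standing in for the file contents:

```python
keys = ["id", "name", "teacher"]  # header, stored separately from any record
rows = [["1", "Biology", "Mr. D"], ["2", "History", "Mrs. P"]]  # illustrative data

list_courses = {}
for row in rows:
    # dict(zip(...)) builds a new dict object on every iteration.
    sheet_item = dict(zip(keys, row))
    sheet_item["id"] = int(sheet_item["id"])
    list_courses[sheet_item["id"]] = sheet_item

print(list_courses[1] is list_courses[2])  # False: two distinct objects
print(list_courses[1]["name"])             # Biology
```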
