How to add prefix/suffix on a repeatable dictionary key in Python

Question

Is there a way to keep all repeated (duplicate) keys by adding a prefix or suffix? In the example below, the Address key appears 3 times; the count may vary (1 to 3 times). I want the result shown in the expected output, with a suffix added to make each key unique. Currently, the update() call overwrites the value each time the key repeats.

list = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:John@email.com']
dic = {}
for i in list:
    j=i.split(':')
    dic.update({j[0]:j[1]})
print(dic)

Current output: {'name': 'John', 'age': '25', 'Address': 'Washington', 'email': 'John@email.com'}

Expected output: {'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': 'John@email.com'}

score 4 · Answer 1 · answered Aug 17 '23 at 16:33

Don't use list as a variable name. list is the name of a Python builtin class, and it is used in the following solution. I renamed your list variable l.

This solution consists of first building a multidict (using collections.defaultdict(list)) to store the multiple values:

import collections
d = collections.defaultdict(list)
for entry in l:
    key, value = entry.split(':', 1)  # split on the first colon only
    d[key].append(value)

Now d contains:

{'name': ['John'], 'age': ['25'], 'Address': ['Chicago', 'Phoenix', 'Washington'], 'email': ['John@email.com']}

Then iterate over the items of d and, for any key with more than one value, append a numeric suffix:

output = {}
for key, values in d.items():
    if len(values) > 1:
        for i, value in enumerate(values):
            output[f'{key}{i+1}'] = value
    else:
        output[key] = values[0]

output:

{'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': 'John@email.com'}

This is what I would do as well. Note one can use `setdefault()` here and avoid the import by defining `d = {}` then `d.setdefault(key, []).append(value)`. Of course, I assume you know that and I'm just noting it for other future viewers — JonSG, Aug 17 '23 at 16:36
@JonSG: true, but I find `defaultdict` a tad more readable; and in this specific example, I felt that mentioning the `list` type was a valuable teaching example :-) — fferri, Aug 17 '23 at 16:55
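For future readers, here is a minimal sketch of the `setdefault()` variant mentioned in the comment above, wrapped in a function. The function name `suffix_duplicates` is hypothetical, and splitting on the first colon only is an assumption about the input format (it lets values contain colons).

```python
def suffix_duplicates(entries):
    """Group values per key, then number the keys that occur more than once."""
    grouped = {}
    for entry in entries:
        key, value = entry.split(':', 1)  # assumption: split on the first colon only
        grouped.setdefault(key, []).append(value)

    result = {}
    for key, values in grouped.items():
        if len(values) > 1:
            # Repeated key: Address -> Address1, Address2, ...
            for i, value in enumerate(values, start=1):
                result[f'{key}{i}'] = value
        else:
            # Unique key: keep the name unchanged
            result[key] = values[0]
    return result


data = ['name:John', 'age:25', 'Address:Chicago', 'Address:Phoenix',
        'Address:Washington', 'email:John@email.com']
print(suffix_duplicates(data))
```

No import is needed here, at the cost of the slightly less readable `setdefault()` call.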

score 3 · Answer 2 · answered Aug 17 '23 at 16:37

Another solution (this iterates over the list only once):

lst = [
    "name:John",
    "age:25",
    "Address:Chicago",
    "Address:Phoenix",
    "Address:Washington",
    "email:John@email.com",
]

cnts, out = {}, {}
for k, v in map(lambda s: s.split(":"), lst):
    c = cnts.get(k, 0)
    if c == 0:
        out[k] = v
    elif c == 1:
        out[f"{k}1"] = out.pop(k)
        out[f"{k}2"] = v
    else:
        out[f"{k}{c + 1}"] = v

    cnts[k] = c + 1

print(out)

Prints:

{
    "name": "John",
    "age": "25",
    "Address1": "Chicago",
    "Address2": "Phoenix",
    "Address3": "Washington",
    "email": "John@email.com",
}

score 1 · Accepted Answer · answered Aug 17 '23 at 16:27

You can use something like this:

list_ = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:John@email.com']

dic = {}
for i in list_:
    j = i.split(':')
    key_ = j[0]
    count = 0 # counts the number of duplicates
    while key_ in dic:
        count += 1
        key_ = j[0] + str(count)
    dic[key_] = j[1]

Output:

{'name': 'John',
 'age': '25',
 'Address': 'Chicago',
 'Address1': 'Phoenix',
 'Address2': 'Washington',
 'email': 'John@email.com'}

PS: don't use list as a variable name. It is not a keyword but the name of the built-in list type, and reusing it shadows that type.

Suraj Shourie

This keeps the duplicate keys, but the 1st one is not suffixed. – Swifty Aug 17 '23 at 16:39
Ooh I misread, didn't notice 1st one needs to be suffixed as well. – Suraj Shourie Aug 17 '23 at 16:41
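As the comments point out, the expected output also numbers the first occurrence. One possible way to get that, sketched here as an assumption and not as the answerer's own fix, is to count how often each key occurs before building the dictionary. The names `totals` and `seen` are illustrative.

```python
from collections import Counter

data = ['name:John', 'age:25', 'Address:Chicago', 'Address:Phoenix',
        'Address:Washington', 'email:John@email.com']

pairs = [item.split(':', 1) for item in data]
totals = Counter(key for key, _ in pairs)  # how often each key occurs overall
seen = Counter()                           # how often each key has occurred so far

dic = {}
for key, value in pairs:
    if totals[key] > 1:
        # Repeated key: number every occurrence, starting at 1
        seen[key] += 1
        dic[f'{key}{seen[key]}'] = value
    else:
        # Unique key: keep the name unchanged
        dic[key] = value

print(dic)
```

Because the totals are known up front, the first 'Address' can be stored as 'Address1' right away and never has to be renamed later.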

Alain T. · Answer 4 · 2023-08-17T21:46:14.230

You could first separate values from keys in two lists, then make the keys list unique by adding suffixes and combine the unique keys with the values into a dictionary at the end:

data = ['name:John','age:25','Address:Chicago',
        'Address:Phoenix','Address:Washington','email:John@email.com']

keys,values = zip(*(s.split(":") for s in data))
keys        = [ k+str(keys[:i].count(k))*(keys.count(k)>1) 
                for i,k in enumerate(keys,1) ]
dic         = dict(zip(keys,values))

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': 'John@email.com'}

Note that this does not cover cases where a suffixed key clashes with an original key. For example, ["Address1:...","Address:...","Address:..."] would produce a duplicate "Address1", because a suffix is added to the "Address" key. If that situation can occur in your data, a different approach is needed.
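One possible "different approach" is sketched below. It rests on two assumptions that may not suit your data: a key that occurs only once must keep its exact name, and a suffix number that would clash is simply skipped. The function name `unique_keys` is hypothetical.

```python
from collections import Counter


def unique_keys(entries):
    pairs = [entry.split(':', 1) for entry in entries]
    totals = Counter(key for key, _ in pairs)
    # Keys that occur exactly once keep their name, so reserve them up front.
    reserved = {key for key, n in totals.items() if n == 1}
    next_suffix = Counter()
    result = {}
    for key, value in pairs:
        if totals[key] == 1:
            result[key] = value
            continue
        # Advance the suffix until the candidate clashes with nothing.
        while True:
            next_suffix[key] += 1
            candidate = f'{key}{next_suffix[key]}'
            if candidate not in reserved and candidate not in result:
                break
        result[candidate] = value
    return result


print(unique_keys(['Address1:X', 'Address:Y', 'Address:Z']))
# {'Address1': 'X', 'Address2': 'Y', 'Address3': 'Z'}
```

With this rule the repeated "Address" entries start at 2 because "Address1" is already taken, so the numbering is no longer guaranteed to start at 1.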

Alternatively, you can use a dictionary to group values in lists associated with each key and then expand this group dictionary to produce distinct keys:

grp = dict()
grp.update( (k,grp.get(k,[])+[v]) for s in data for k,v in [s.split(":")] )
dic = { k+str(i or ''):v for k,g in grp.items() 
                         for i,v in enumerate(g,len(g)>1) }

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': 'John@email.com'}

Although grp itself may actually be easier to manipulate in subsequent code:

print(grp)

{'name':    ['John'], 
 'age':     ['25'], 
 'Address': ['Chicago', 'Phoenix', 'Washington'], 
 'email':   ['John@email.com']}
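To illustrate why the grouped form can be handier, here is a small sketch of typical follow-up code. It rebuilds the grouping with a plain loop so that it runs on its own; the variable names `addresses` and `first_address` are illustrative.

```python
data = ['name:John', 'age:25', 'Address:Chicago',
        'Address:Phoenix', 'Address:Washington', 'email:John@email.com']

# Same grouping as above, written as an explicit loop
grp = {}
for item in data:
    key, value = item.split(':', 1)
    grp.setdefault(key, []).append(value)

# All addresses are under one key, however many there are
addresses = grp.get('Address', [])
print(len(addresses))        # number of addresses
print(', '.join(addresses))  # addresses in input order

# No need to probe for Address1, Address2, ... to find the first one
first_address = addresses[0] if addresses else None
```

With the suffixed dictionary, the same questions would require checking which of 'Address', 'Address1', 'Address2', ... actually exist.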
