5

Is there a way to keep all repeated (duplicate) keys by adding a prefix or suffix? In the example below, the Address key appears 3 times; the count may vary (1 to 3 times). I want the output shown under "Expected output", with a suffix added to make each key unique. Currently, the update call overwrites the value stored under the key.

list = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:John@email.com']
dic = {}
for i in list:
    j=i.split(':')
    dic.update({j[0]:j[1]})
print(dic)

Current output: {'name': 'John', 'age': '25', 'Address': 'Washington', 'email': 'John@email.com'}

Expected output: {'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': 'John@email.com'}

fferri
Sri

4 Answers

4

Don't use list as a variable name. list is the name of a Python builtin class, and it is used in the following solution. I renamed your list variable l.

This solution consists of first building a multidict (using collections.defaultdict(list)) to store the multiple values:

import collections
d = collections.defaultdict(list)
for entry in l:
    # maxsplit=1: split only at the first ':' so a value containing ':' stays intact
    key, value = entry.split(':', 1)
    d[key].append(value)

Now d contains:

{'name': ['John'], 'age': ['25'], 'Address': ['Chicago', 'Phoenix', 'Washington'], 'email': ['John@email.com']}

Then iterate over the items of d and, if a key has more than one value, append a numeric suffix:

output = {}
for key, values in d.items():
    if len(values) > 1:
        for i, value in enumerate(values):
            output[f'{key}{i+1}'] = value
    else:
        output[key] = values[0]

Output:

{'name': 'John', 'age': '25', 'Address1': 'Chicago', 'Address2': 'Phoenix', 'Address3': 'Washington', 'email': 'John@email.com'}
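
For reuse, the two steps above could be wrapped into a single function. This is only a sketch: the function name `suffix_duplicates` and the `sep` parameter are hypothetical, and it assumes every entry contains at least one separator.

```python
from collections import defaultdict

def suffix_duplicates(entries, sep=":"):
    """Map 'key:value' strings to a dict, numbering keys that repeat."""
    grouped = defaultdict(list)
    for entry in entries:
        # maxsplit=1 keeps any further separators inside the value
        key, value = entry.split(sep, 1)
        grouped[key].append(value)

    output = {}
    for key, values in grouped.items():
        if len(values) > 1:
            # repeated key: Address -> Address1, Address2, ...
            for i, value in enumerate(values, start=1):
                output[f"{key}{i}"] = value
        else:
            # unique key: keep it unchanged
            output[key] = values[0]
    return output

entries = ['name:John', 'age:25', 'Address:Chicago', 'Address:Phoenix',
           'Address:Washington', 'email:John@email.com']
result = suffix_duplicates(entries)
print(result)
```

Because the values are grouped before the output is built, suffixed keys appear at the position where the key was first seen.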

fferri
  • 1
    This is what I would do as well. Note one can use `setdefault()` here and avoid the import by defining `d = {}` then `d.setdefault(key, []).append(value)`. Of course, I assume you know that and I'm just noting it for other future viewers – JonSG Aug 17 '23 at 16:36
  • 1
    @JonSG: true, but I find `defaultdict` a tad more readable; and in this specific example, I felt that mentioning the `list` type was a valuable teaching example :-) – fferri Aug 17 '23 at 16:55
3

Another solution (this iterates over the list only once):

lst = [
    "name:John",
    "age:25",
    "Address:Chicago",
    "Address:Phoenix",
    "Address:Washington",
    "email:John@email.com",
]

cnts, out = {}, {}
for k, v in map(lambda s: s.split(":"), lst):
    c = cnts.get(k, 0)
    if c == 0:
        out[k] = v
    elif c == 1:
        out[f"{k}1"] = out.pop(k)
        out[f"{k}2"] = v
    else:
        out[f"{k}{c + 1}"] = v

    cnts[k] = c + 1

print(out)

Prints:

{
    "name": "John",
    "age": "25",
    "Address1": "Chicago",
    "Address2": "Phoenix",
    "Address3": "Washington",
    "email": "John@email.com",
}
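
One caveat: unpacking the result of `s.split(":")` into `k, v` assumes that no value contains a colon. A minimal sketch of what happens otherwise, and how limiting the split with `maxsplit=1` avoids it (the `website` entry is a made-up example, not part of the question's data):

```python
entry = "website:https://example.com"  # hypothetical entry whose value contains ':'

# Splitting on every colon yields three parts, so two-name unpacking fails.
try:
    k, v = entry.split(":")
except ValueError as exc:
    print("unpacking failed:", exc)

# Limiting the split to the first colon keeps the rest of the value intact.
k, v = entry.split(":", 1)
print(k, v)
```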
Andrej Kesely
1

You can use something like this:

list_ = ['name:John','age:25','Address:Chicago','Address:Phoenix','Address:Washington','email:John@email.com']

dic = {}
for i in list_:
    j = i.split(':')
    key_ = j[0]
    count = 0 # counts the number of duplicates
    while key_ in dic:
        count += 1
        key_ = j[0] + str(count)
    dic[key_] = j[1]

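Output (the first occurrence keeps the plain key and later duplicates get the suffixes 1, 2, ..., which differs slightly from the expected output in the question):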

{'name': 'John',
 'age': '25',
 'Address': 'Chicago',
 'Address1': 'Phoenix',
 'Address2': 'Washington',
 'email': 'John@email.com'}

PS: Don't use list as a variable name. It is not a keyword but the name of a built-in type, and assigning to it shadows that type.
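
To illustrate why reusing the name `list` causes trouble, here is a small sketch. The function `demo` is hypothetical and only exists to keep the shadowing local:

```python
def demo():
    # Inside this function, `list` now refers to a list object, not the built-in type.
    list = ['name:John', 'age:25']
    try:
        list("abc")  # tries to call the list object
    except TypeError as exc:
        return str(exc)

message = demo()
print(message)

# Outside the function the built-in is untouched and works as usual.
restored = list("abc")
print(restored)
```

At module level the same assignment would shadow the built-in for the rest of the script, which is why a name such as `list_` or `lst` is safer.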

Suraj Shourie
1

You could first separate values from keys in two lists, then make the keys list unique by adding suffixes and combine the unique keys with the values into a dictionary at the end:

data = ['name:John','age:25','Address:Chicago',
        'Address:Phoenix','Address:Washington','email:John@email.com']

keys,values = zip(*(s.split(":") for s in data))
keys        = [ k+str(keys[:i].count(k))*(keys.count(k)>1) 
                for i,k in enumerate(keys,1) ]
dic         = dict(zip(keys,values))

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': 'John@email.com'}

Note that this does not cover cases where the suffixed keys clash with original keys. For example, ["Address1:...","Address:...","Address:..."] would produce a duplicate "Address1" by adding a suffix to the "Address" key, so one of the values would be silently overwritten. If that situation could exist in your data, a different approach would be needed.
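
One possible way to handle such clashes (a sketch only, not the approach used above) is to reserve every key that appears verbatim in the input and skip suffix numbers that would collide. The function name `unique_keys` and the sample city values are assumptions for illustration:

```python
from collections import Counter

def unique_keys(pairs):
    """Suffix repeated keys with 1, 2, 3, ... while skipping names already taken."""
    totals = Counter(key for key, _ in pairs)
    taken = set(totals)   # names present verbatim in the input are reserved
    next_index = {}       # next suffix number to try for each repeated key
    out = {}
    for key, value in pairs:
        if totals[key] == 1:
            out[key] = value          # unique key: keep as-is
            continue
        n = next_index.get(key, 1)
        while f"{key}{n}" in taken:   # skip suffixes that would clash
            n += 1
        new_key = f"{key}{n}"
        taken.add(new_key)
        next_index[key] = n + 1
        out[new_key] = value
    return out

data = ["Address1:Boston", "Address:Chicago", "Address:Phoenix"]
pairs = [s.split(":", 1) for s in data]
result = unique_keys(pairs)
print(result)
```

With this input, the two "Address" entries become "Address2" and "Address3", because "Address1" is already in use, so no value is lost.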

Alternatively, you can use a dictionary to group values in lists associated with each key and then expand this group dictionary to produce distinct keys:

grp = dict()
grp.update( (k,grp.get(k,[])+[v]) for s in data for k,v in [s.split(":")] )
dic = { k+str(i or ''):v for k,g in grp.items() 
                         for i,v in enumerate(g,len(g)>1) }

print(dic)

{'name': 'John', 
 'age': '25', 
 'Address1': 'Chicago', 
 'Address2': 'Phoenix', 
 'Address3': 'Washington', 
 'email': 'John@email.com'}

Although grp itself may actually be easier to manipulate in subsequent code:

print(grp)

{'name':    ['John'], 
 'age':     ['25'], 
 'Address': ['Chicago', 'Phoenix', 'Washington'], 
 'email':   ['John@email.com']}
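
As a rough illustration of why the grouped form can be more convenient, the sketch below rebuilds `grp` with `dict.setdefault` (which avoids any import) and reads the addresses without needing to know how many suffixed keys exist. The variable names `address_count`, `first_address` and `all_addresses` are made up for this example:

```python
data = ['name:John', 'age:25', 'Address:Chicago',
        'Address:Phoenix', 'Address:Washington', 'email:John@email.com']

# Group values by key; setdefault creates the list the first time a key is seen.
grp = {}
for s in data:
    k, v = s.split(":", 1)
    grp.setdefault(k, []).append(v)

address_count = len(grp["Address"])        # how many addresses were given
first_address = grp["Address"][0]          # first one, in input order
all_addresses = ", ".join(grp["Address"])  # all of them, joined for display

print(address_count, first_address, all_addresses)
```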
Alain T.