
I'm not sure I'm thinking about this problem correctly. I'd like to write a function that takes a list with duplicates and appends an incrementing suffix to each repeated item to "dedup" the list.

For example:

dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']

Aiming to return:

deduped = ['apple','banana1','cherry1','banana2','cherry2','orange','cherry3']

My instinct was to use the pop function while iterating over the list with a while statement, like so:

def dedup_suffix(an_list):
    dedup=[]
    for each in an_list:
        an_list.pop(an_list.index(each)) #pop it out
        i=1 #iterator
        while each in an_list:
            an_list.pop(an_list.index(each))
            i+=1
            appendage=str(each)+"_"+str(i)
        else:
            appendage=str(each)
        dedup.append(appendage)
    return dedup

But:

>>> dedup_suffix(dup_list)

['apple', 'cherry', 'orange']

Appreciate any pointers.

JMcClure

2 Answers


You can use a Counter to keep track of the number of occurrences. I'm assuming your example is correct with respect to apple, so that you don't want to add a zero to the first occurrence. For that you need a bit of logic:

from collections import Counter
counter = Counter()

dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']
deduped = []
for name in dup_list:
    new = name + str(counter[name]) if counter[name] else name
    counter.update({name: 1})
    deduped.append(new)
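
For reference, the same loop can be wrapped in a function. This is only a minimal sketch, and the name suffix_repeats is made up for illustration. Note that with this logic the first occurrence of every item stays unsuffixed, so the result differs slightly from the target list in the question:

```python
from collections import Counter

def suffix_repeats(names):
    """Leave the first occurrence of each name alone; suffix later ones with 1, 2, ..."""
    seen = Counter()  # how many times each name has appeared so far
    result = []
    for name in names:
        # seen[name] is 0 on the first occurrence, so no suffix is added then
        result.append(f"{name}{seen[name]}" if seen[name] else name)
        seen[name] += 1
    return result

dup_list = ['apple', 'banana', 'cherry', 'banana', 'cherry', 'orange', 'cherry']
print(suffix_repeats(dup_list))
# ['apple', 'banana', 'cherry', 'banana1', 'cherry1', 'orange', 'cherry2']
```

Because the function builds a new list, the input list is left unchanged.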
Thomas Fenzl

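You can count the number of duplicates using a collections.Counter object, then build a new list by iterating over the counts. Note that this groups all copies of a repeated item together rather than keeping them in their original positions: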

from collections import Counter

dup_list = ['apple','banana','cherry','banana','cherry','orange','cherry']
c = Counter(dup_list)

dedup=[]
for w in c:
    n = c[w]
    if n == 1:
        dedup.append(w)
    else:
        for i in range(1,n+1):
            dedup.append(w+str(i))
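
If the original order matters, one possible variation is to use the total counts only to decide which items need a suffix, and a second running counter to number them as they appear. This is a sketch under that assumption, not a definitive solution; the names totals and seen are illustrative:

```python
from collections import Counter

def dedup_suffix(names):
    """Suffix every occurrence of a repeated name with 1, 2, ... in original order."""
    totals = Counter(names)  # total occurrences of each name
    seen = Counter()         # occurrences encountered so far
    result = []
    for name in names:
        if totals[name] == 1:
            # unique items are kept as they are
            result.append(name)
        else:
            seen[name] += 1
            result.append(f"{name}{seen[name]}")
    return result

dup_list = ['apple', 'banana', 'cherry', 'banana', 'cherry', 'orange', 'cherry']
print(dedup_suffix(dup_list))
# ['apple', 'banana1', 'cherry1', 'banana2', 'cherry2', 'orange', 'cherry3']
```

This does not guard against a generated name colliding with an item already in the list (for example, a list that contains both 'a' twice and a literal 'a1').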