Keep first occurrence of duplicated values in a dictionary

Question

I have a dictionary with duplicated values, such as the following:

    dict_numbers = {'one': ['one', 'first', 'uno', 'une'],
                    'zero': ['zero', 'nothing', 'cero'],
                    'first': ['one', 'first', 'uno', 'une'],
                    'dos': ['two', 'second', 'dos', 'segundo', 'deux'],
                    'three': ['three', 'tres', 'third', 'tercero'],
                    'second': ['two', 'second', 'dos', 'segundo', 'deux'],
                    'forth': ['four', 'forth', 'cuatro', 'cuarto'],
                    'two': ['two', 'second', 'dos', 'segundo', 'deux'],
                    'segundo': ['two', 'second', 'dos', 'segundo', 'deux']}

I'd like to keep only the first key for each group of duplicated values. Note that the dictionary does not have duplicated keys, only duplicated values. In this example, I would get a list containing the first key for each distinct value:

    list_numbers_no_duplicates = ['one', 'zero', 'dos', 'three', 'forth']

The `first` key is removed because `one` already has the same values. The `second`, `two` and `segundo` keys are removed because `dos` already has the same values.

How can I keep track of the duplicates among the values of the keys?

Thanks in advance

can you explain `'dos'` and `'forth'` in the expected output? — Marat, Oct 06 '20 at 01:41
`dos` is the first key with the value `['two','second','dos', 'segundo','deux']`, `two` and `segundo` keys are also in the dictionary, but they have the same value `['two','second','dos', 'segundo','deux']`, they appear after `dos`, so they are not part of the final_list. — John Barton, Oct 06 '20 at 01:58

score 1 · Accepted Answer · answered Oct 06 '20 at 02:21

Hopefully I understood your goal correctly. The following uses `chain` from the ever-so-useful `itertools` module.

    >>> from itertools import chain
    >>> {key: vals for i, (key, vals) in enumerate(dict_numbers.items())
    ...  if key not in chain(*list(dict_numbers.values())[:i])}
    {'one': ['one', 'first', 'uno', 'une'],
     'zero': ['zero', 'nothing', 'cero'],
     'dos': ['two', 'second', 'dos', 'segundo', 'deux'],
     'three': ['three', 'tres', 'third', 'tercero'],
     'forth': ['four', 'forth', 'cuatro', 'cuarto']}

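Essentially, this rebuilds the original dictionary, keeping only the entries whose key is not found in any of the preceding value lists (hence the `enumerate` and slicing shenanigans). Note that it relies on each duplicated key appearing inside an earlier value list, as in your example; it does not compare the value lists themselves.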
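If the keys do not always appear inside the value lists, an alternative is to track the values themselves. The following is only a minimal sketch, not part of the approach above: the helper name `first_keys_with_unique_values` is made up for illustration, and it assumes Python 3.7+ (where dictionaries keep insertion order) and that two value lists count as duplicates only when they hold the same items in the same order.

```python
dict_numbers = {'one': ['one', 'first', 'uno', 'une'],
                'zero': ['zero', 'nothing', 'cero'],
                'first': ['one', 'first', 'uno', 'une'],
                'dos': ['two', 'second', 'dos', 'segundo', 'deux'],
                'three': ['three', 'tres', 'third', 'tercero'],
                'second': ['two', 'second', 'dos', 'segundo', 'deux'],
                'forth': ['four', 'forth', 'cuatro', 'cuarto'],
                'two': ['two', 'second', 'dos', 'segundo', 'deux'],
                'segundo': ['two', 'second', 'dos', 'segundo', 'deux']}


def first_keys_with_unique_values(d):
    """Return the keys whose value list has not been seen under an earlier key.

    Hypothetical helper: lists are unhashable, so each one is converted to a
    tuple before being stored in the set of values seen so far.
    """
    seen = set()
    kept = []
    for key, vals in d.items():  # insertion order (Python 3.7+)
        fingerprint = tuple(vals)
        if fingerprint not in seen:
            seen.add(fingerprint)
            kept.append(key)
    return kept


list_numbers_no_duplicates = first_keys_with_unique_values(dict_numbers)
print(list_numbers_no_duplicates)
# ['one', 'zero', 'dos', 'three', 'forth']
```

If the order of items inside a list should not matter, `frozenset(vals)` could be used as the fingerprint instead of `tuple(vals)`. Either way this makes a single pass over the dictionary rather than re-scanning all preceding lists for every key.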
