
I have a dictionary with many keys that differ only in letter case, since dictionary keys are case-sensitive. I'd like to merge each group under a single lowercase key, with the values of the merged keys summed.

I have something like:

>>> data
{'Blue Car': 73, 'blue Car': 21, 'yellow car': 10, 'Yellow Car': 15, 'Red Car': 12, 'Red car': 17, 'red car': 10, 'Yellow car': 18}

And the output should be like:

>>> newData
{'blue car': 94, 'yellow car': 43, 'red car': 39}
Vinícius Figueiredo
  • Have you made any attempts? There are tons of similar questions on StackOverflow, did you try any of those approaches? – juanpa.arrivillaga Jan 31 '17 at 18:34
  • I'm sorry if this is a duplicate, I couldn't find any question that was about this specific problem, the "case insensitive dictionary" ones were not solving my problem. – Vinícius Figueiredo Jan 31 '17 at 18:47

5 Answers

Use defaultdict:

from collections import defaultdict

newData = defaultdict(int)

# Iterate over key/value pairs and add each value under the lowercase key.
for k, v in data.items():
    newData[k.lower()] += v

# dict(newData) == {'blue car': 94, 'red car': 39, 'yellow car': 43}
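
For reference, here is a self-contained sketch of the same approach run against the data from the question. The function name `merge_case_insensitive` is made up for illustration, and the result is converted back to a plain dict so it prints without the defaultdict wrapper.

```python
from collections import defaultdict


def merge_case_insensitive(data):
    """Sum the values of keys that differ only by case (hypothetical helper name)."""
    totals = defaultdict(int)  # missing keys start at 0
    for key, value in data.items():
        totals[key.lower()] += value
    return dict(totals)  # plain dict instead of defaultdict


data = {'Blue Car': 73, 'blue Car': 21, 'yellow car': 10, 'Yellow Car': 15,
        'Red Car': 12, 'Red car': 17, 'red car': 10, 'Yellow car': 18}
newData = merge_case_insensitive(data)
print(newData)
```

Returning a plain dict also avoids accidentally creating new keys later, since looking up a missing key in a defaultdict inserts it.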

I hope this helps.

Abdou

How about using a defaultdict:

from collections import defaultdict
newData = defaultdict(int)
for k, v in data.items():  # iteritems() exists only in Python 2; items() works in both
    newData[k.lower()] += v
arshajii

Try this:

def compress(data):
    newDict = dict()
    for key in data:
        # dict.get() takes its default positionally; default=0 raises a TypeError
        newDict[key.lower()] = newDict.get(key.lower(), 0) + data[key]
    return newDict
Tom

Using a dict comprehension over the set of lowercase keys:

>>> {x: sum(v for k, v in data.items() if k.lower()==x) for x in set(map(lambda x: x.lower(), data))}
{'red car': 39, 'blue car': 94, 'yellow car': 43}

Or, in a more readable form:

SET = set(map(lambda x: x.lower(), data))
SUM = lambda x: sum(v for k, v in data.items() if k.lower()==x)
newData = {x: SUM(x) for x in SET}

# newData = {'red car': 39, 'blue car': 94, 'yellow car': 43}

Explained:

SET = set(map(lambda x: x.lower(), data))

obtains all unique lowercase keys,

SUM = lambda x: sum(v for k, v in data.items() if k.lower()==x)

returns the sum of the values for keys in data matching the unique key, and

{x: SUM(x) for x in SET}

pairs each unique key in the set with its summed value.
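
One thing to keep in mind: the comprehension walks all of `data` once for every unique lowercase key, so the work grows with the number of items times the number of unique keys. A quick sketch, using the data from the question, to check that it agrees with a loop that visits each item only once (the names `by_comprehension` and `single_pass` are just for illustration):

```python
data = {'Blue Car': 73, 'blue Car': 21, 'yellow car': 10, 'Yellow Car': 15,
        'Red Car': 12, 'Red car': 17, 'red car': 10, 'Yellow car': 18}

# Comprehension version: one full scan of data per unique lowercase key.
unique_keys = {k.lower() for k in data}
by_comprehension = {x: sum(v for k, v in data.items() if k.lower() == x)
                    for x in unique_keys}

# Single-pass version: each item is visited exactly once.
single_pass = {}
for k, v in data.items():
    single_pass[k.lower()] = single_pass.get(k.lower(), 0) + v

# Both approaches produce the same totals.
assert by_comprehension == single_pass
```

For a dictionary this small the difference is negligible; it only starts to matter when there are many distinct keys.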

Uriel

I would subclass dict and override the __getitem__ and __setitem__ magic methods:

class NormalizedDict(dict):
    def __getitem__(self,key):
        return dict.__getitem__(self,key.lower())
    def __setitem__(self,key,value):
        return dict.__setitem__(self,key.lower(),value)

myDict = NormalizedDict()
myDict['aPPles'] =5
print(myDict)

Of course, we can take this further and sum the values automatically:

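class NormalizedSumDict(NormalizedDict):
    def __setitem__(self, key, value):
        # If the normalized key already holds a value of the same type, add to it.
        if key.lower() in self and type(self[key]) == type(value):
            try:
                value = value + self[key]
            except TypeError:
                pass  # type does not support +, so just overwrite
        NormalizedDict.__setitem__(self, key, value)

    def update(self, other):
        # Route every item through __setitem__ so values are summed.
        for k, v in other.items():
            self[k] = v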

d = NormalizedSumDict()
d['aPPles']=5
d['Apples']=2
print(d)
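
Applied to the data from the question, this could look like the sketch below. The classes are repeated so the snippet runs on its own, with the try/except left out for brevity. One caveat, at least in CPython: the dict constructor does not call `__setitem__`, so passing `data` straight to the constructor leaves the keys untouched; going through `update()` (or item assignment) is what triggers the normalization and summing.

```python
class NormalizedDict(dict):
    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())

    def __setitem__(self, key, value):
        dict.__setitem__(self, key.lower(), value)


class NormalizedSumDict(NormalizedDict):
    def __setitem__(self, key, value):
        # Add to the existing value when the normalized key is already present.
        if key.lower() in self and type(self[key]) == type(value):
            value = value + self[key]
        NormalizedDict.__setitem__(self, key, value)

    def update(self, other):
        # Go through __setitem__ so that values are summed rather than replaced.
        for k, v in other.items():
            self[k] = v


data = {'Blue Car': 73, 'blue Car': 21, 'yellow car': 10, 'Yellow Car': 15,
        'Red Car': 12, 'Red car': 17, 'red car': 10, 'Yellow car': 18}

d = NormalizedSumDict()
d.update(data)  # every item goes through __setitem__
print(d)

# Caveat: the constructor bypasses __setitem__, so the original keys survive.
raw = NormalizedSumDict(data)
print(len(raw))
```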
Joran Beasley