My question is different from the one asked here. I am primarily asking what improvements could be made to code that uses dictionaries. The linked question covers memory profilers, which will be my next step.

I have the following two snippets that achieve the same thing.

First one,

a={1: 'a', 2: 'b', 3: 'c', 4: 'd'}
b=[a[x] for x in a if x in (1,2,3)]
# b == ['a', 'b', 'c']

Second one,

a={1: 'a', 2: 'b', 3: 'c', 4: 'd'}
b=[a[x] for x in set(a.keys()) & set([1,2,3])]
# b == ['a', 'b', 'c']

I would like to know which one uses less memory, particularly for large data sets.

Thanks in advance!

Annapoornima Koppad
  • Possible duplicate of [Which Python memory profiler is recommended?](http://stackoverflow.com/questions/110259/which-python-memory-profiler-is-recommended) – Craig Burgler Aug 28 '16 at 23:44
  • Both are pretty bad in terms of memory optimization. Dicts are specifically fast at key access and you're avoiding the fastest (and simplest) way of doing so, using subscripting. How many keys are we talking about? The fastest way to do your example would be `[a[1], a[2], a[3]]` or `[a[x] for x in (1, 2, 3)]` – Two-Bit Alchemist Aug 28 '16 at 23:44
  • I would imagine `[a[x] for x in (1, 2, 3) if x in a]` is faster than both. If `(1, 2, 3)` is really big, you can precompute it as a `frozenset` and just do `the_set.intersection(a)`. – Blender Aug 28 '16 at 23:46
  • @SamRedway It doesn't matter if the dict is 20,000 entries if we're only interested in 12 keys though. – Two-Bit Alchemist Aug 28 '16 at 23:48

2 Answers

If you're optimizing for memory use, generators are often a good tool. For example:

def get_keys(mapping, keys):
    for key in keys:
        try:
            yield mapping[key]
        except KeyError:
            continue

On your example:

list(get_keys(a, (1, 2, 3)))
['a', 'b', 'c']
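
Note that wrapping the generator in `list(...)` materializes every value at once, so the memory benefit mostly shows up when the values are consumed one at a time. A minimal sketch of that, repeating `get_keys` so it runs on its own; the extra key `99` and the `joined` name are only there for illustration:

```python
def get_keys(mapping, keys):
    """Yield mapping[key] for each requested key, skipping missing ones."""
    for key in keys:
        try:
            yield mapping[key]
        except KeyError:
            continue


a = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}

# str.join pulls values from the generator one by one, so no
# intermediate list of results is built by our code.
# Key 99 is not in the dict and is silently skipped.
joined = ''.join(get_keys(a, (1, 2, 3, 99)))
```

Any consumer that iterates (a `for` loop, `sum`, writing to a file) works the same way.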
Two-Bit Alchemist

Of the two, the second is better. Overall, though, both approaches leave room for improvement. For something similar, I would write:

>>> a = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
>>> [a[i] for i in (1,2,3) if i in a]
['a', 'b', 'c']
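
Which variant is lighter depends on how big the dict and the key list are, so it is probably safer to measure than to guess. Below is a rough sketch using the standard library's `tracemalloc`; the helper `peak_bytes`, the function names, and the 100,000-entry dict are assumptions made up for this illustration, not part of the question.

```python
import tracemalloc


def peak_bytes(func):
    """Return the peak number of bytes allocated while func() runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Built before tracing starts, so the dict itself is not counted.
data = {i: str(i) for i in range(100_000)}
wanted = (1, 2, 3)


def scan_dict():
    # First variant from the question: walks every key in the dict.
    return [data[x] for x in data if x in wanted]


def build_sets():
    # Second variant from the question: copies all keys into a new set.
    return [data[x] for x in set(data.keys()) & set(wanted)]


def lookup_wanted():
    # Variant from this answer: iterates only over the wanted keys.
    return [data[x] for x in wanted if x in data]


results = {f.__name__: peak_bytes(f)
           for f in (scan_dict, build_sets, lookup_wanted)}

for name, peak in results.items():
    print(f"{name}: peak {peak} bytes")
```

The absolute numbers vary by Python version and platform, so treat them as a relative comparison only.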
Moinuddin Quadri