15

There are already a lot of questions about sorting dictionaries but I can't find the right answer to my question.

I have the dictionary v:

v = {3:4.0, 1:-2.0, 10:3.5, 0:1.0}

I have to turn the dictionary v into a list ordered by key, so that the expected result is:

lijst(v) = [1.0, -2.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.5]

I have tried working with this code:

def lijst(x):
    return sorted(x.items(), key=lambda x: x[1])

This is the list I receive:

lijst(v) = [(1, -2.0), (0, 1.0), (10, 3.5), (3, 4.0)]

Does anyone know how to convert this into a list of values sorted in order of their key, with the missing values padded with zero?

itseva

7 Answers

11

Just use itertools.chain.from_iterable to flatten your result (the list of tuples):

>>> import itertools

>>> list(itertools.chain.from_iterable([(1, -2.0), (0, 1.0), (10, 3.5), (3, 4.0)]))
[1, -2.0, 0, 1.0, 10, 3.5, 3, 4.0]

In case I misunderstood your original request and the dictionary represents a "sparse vector" (where the keys are the indices) you could simply populate a list containing only zeros:

>>> res = [0.0]*(max(v)+1)       # create a dummy list containing only zeros
>>> for idx, val in v.items():   # populate the requested indices
...     res[idx] = val 
>>> res
[1.0, -2.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.5]

Or if you have NumPy you could also avoid the for-loop:

>>> import numpy as np

>>> arr = np.zeros(max(v)+1)
>>> arr[list(v.keys())] = list(v.values())
>>> arr
array([ 1. , -2. ,  0. ,  4. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  3.5])

The last approach relies on the fact that keys() and values() iterate in corresponding order as long as the dictionary is not modified in between, whatever that order happens to be:

Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond.

Source: 4.10.1. Dictionary view objects
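As a small sanity check of that guarantee, here is a sketch (the names `pairs` and `dense` are mine, not part of the answer above) that pairs up `keys()` and `values()` and fills the list from those pairs. It assumes the keys are non-negative integers, as in the question.

```python
v = {3: 4.0, 1: -2.0, 10: 3.5, 0: 1.0}

# Pairing keys() with values() gives the same pairs as items(),
# provided the dictionary is not modified in between.
pairs = list(zip(v.keys(), v.values()))
assert pairs == list(v.items())

# Fill a zero-initialised list at the positions given by the keys.
dense = [0.0] * (max(v) + 1)
for idx, val in pairs:
    dense[idx] = val

print(dense)
```

If the check holds on your interpreter, iterating over `v.items()` directly is the simpler way to get the same pairs.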

MSeifert
7

You can try this using chain from itertools:

from itertools import chain

v = {3:4.0, 1:-2.0, 10:3.5, 0:1.0}

final_output = list(chain(*sorted(v.items(), key=lambda x: x[1])))

Output:

[1, -2.0, 0, 1.0, 10, 3.5, 3, 4.0]
Ajax1234
5

One way to concatenate the (key, value) pairs is by using sum() with an initial value:

>>> sum(sorted(v.items(), key=lambda x:x[1]), ())
(1, -2.0, 0, 1.0, 10, 3.5, 3, 4.0)

This returns a tuple. Pass it to list() if you really, really need a list.

P.S. As rightly pointed out by @MSeifert in the comments, this almost certainly has O(n**2) time complexity whereas list(chain(...)) is likely amortized linear.
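To see that both spellings give the same flattened result, here is a minimal sketch (the names `via_sum` and `via_chain` are only for illustration). It does not measure timing; it only shows that the two are interchangeable in terms of output.

```python
from itertools import chain

v = {3: 4.0, 1: -2.0, 10: 3.5, 0: 1.0}
pairs = sorted(v.items(), key=lambda kv: kv[1])

# sum() concatenates tuples one by one, building a new, longer tuple each time.
via_sum = list(sum(pairs, ()))

# chain.from_iterable() walks each pair once without intermediate tuples.
via_chain = list(chain.from_iterable(pairs))

print(via_sum == via_chain)
```

For a handful of pairs the difference does not matter; for long inputs the chain version is the safer default.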

NPE
4

Another option is to use the yield from syntax introduced in Python 3.3:

>>> lst = [(1, -2.0), (0, 1.0), (10, 3.5), (3, 4.0)]
>>> list([(yield from tup) for tup in lst])
[1, -2.0, 0, 1.0, 10, 3.5, 3, 4.0]
>>> 

Caveat: Using yield from this way inside a list comprehension may not be "official syntax", and some (including Guido) consider it a bug. As noted in the comments, it is deprecated in Python 3.7 and no longer works in Python 3.8.
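If you want the same result without relying on that behaviour, a sketch of two alternatives follows: a nested comprehension, and `yield from` inside a regular generator function, which is where it is meant to be used. The function name `flatten` is hypothetical, chosen here for illustration.

```python
lst = [(1, -2.0), (0, 1.0), (10, 3.5), (3, 4.0)]

# Nested comprehension: the outer loop comes first, the inner loop second.
flat = [item for tup in lst for item in tup]


def flatten(pairs):
    """Yield every element of every tuple in pairs, in order."""
    for tup in pairs:
        yield from tup


flat_from_generator = list(flatten(lst))
print(flat == flat_from_generator)
```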

Christian Dean
  • Even though it's a fun alternative (+1): It's generally not considered pythonic to use a list comprehension for its side-effects. – MSeifert Aug 03 '17 at 11:05
  • @MSeifert Really? I didn't think this was using a list comprehension for side-effects. My understanding is that `[(yield from tup) for tup in lst]` returns a generator expression. This is passed to `list()` which repeatedly calls `next` on the generator and `(yield from tup)` is returned each time. Because of that, it seemed like it was using the actual list comprehension and not just its side-effects. Correct me if I'm wrong? – Christian Dean Aug 03 '17 at 16:07
  • I'm not exactly sure but the list-comprehension does in fact create a list of `None`s (even though you never see them because they are attached to the `StopIteration` in the end) and the outer `list()` just catches the yielded values. But I've [seen people calling it a bug](https://stackoverflow.com/q/32139885/5393381) (but I'm not clear which part is the "buggy part") so I'm definitely not sure if it should be advocated as a syntax that's officially supported. – MSeifert Aug 03 '17 at 19:05
  • @MSeifert Ah, I see. Thanks. I'll add a caveat to my answer concerning the nature of `yield from` in comprehensions. – Christian Dean Aug 03 '17 at 19:08
  • Note: This won't work in 3.8 and it is deprecated in 3.7 – MSeifert May 30 '18 at 21:30
2

You can use a list comprehension to achieve what you want.

If you want to keep 0.0 placeholders for keys that are missing:

[v.get(i, 0.0) for i in range(max(v.keys())+1)]

Output:

[1.0, -2.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.5]

If you don't want 0.0 placeholders you can use:

[v.get(i) for i in range(max(v.keys())+1) if v.get(i) is not None]

Output:

[1.0, -2.0, 4.0, 3.5]

Explanation:

range() yields the indices in ascending order, so you don't have to worry about sorting; each index is then looked up in the dictionary. In the first example a missing key returns 0.0, while in the second example v.get(i) returns None for a missing key and the if clause in the comprehension filters it out.

EDIT:

As Christian mentioned, you can change the 2nd option for more efficiency to:

[v[i] for i in range(max(v.keys())+1) if i in v]

This will avoid calling v.get(i) twice.
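Both variants can be wrapped in small helpers. This is only a sketch: the names `dense_values` and `present_values` are hypothetical, and it assumes the keys are non-negative integers. The second helper iterates over the sorted keys instead of the whole range, which avoids scanning indices that are not in the dictionary.

```python
def dense_values(sparse, fill=0.0):
    """Return values ordered by key, with `fill` for missing keys."""
    if not sparse:  # max() would raise ValueError on an empty dict
        return []
    return [sparse.get(i, fill) for i in range(max(sparse) + 1)]


def present_values(sparse):
    """Return only the stored values, ordered by key."""
    return [sparse[k] for k in sorted(sparse)]


v = {3: 4.0, 1: -2.0, 10: 3.5, 0: 1.0}
print(dense_values(v))
print(present_values(v))
```

The guard for the empty dictionary is a design choice of this sketch; the one-liners above raise a ValueError in that case.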

Mohd
  • Your second solution must make two calls to `dict.get` and thus is inefficient. You can instead simply test if `i` is in `v`'s keys and then add `v[i]` if so: `[v[i] for i in range(max(v.keys()) + 1) if i in v]`. This method is faster by about a factor of two. – Christian Dean Aug 02 '17 at 22:52
0

This is not strictly answering the question but rather trying to understand what you may be trying to achieve. If you are trying to implement sparse vectors, before spending time on a new implementation you may want to look into scipy.sparse.

For example:

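from scipy.sparse import dok_matrix
v = {3:4.0, 1:-2.0, 10:3.5, 0:1.0}
m = dok_matrix((11,1))        # 11x1 column vector
for idx, val in v.items():    # dok_matrix entries are addressed by (row, column)
    m[idx, 0] = val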

The advantage of sparse matrices is that (depending on the fraction of nonzero elements) they may take less memory and/or allow faster computations.

Luca Citi
-2
v = {3:4.0, 1:-2.0, 10:3.5, 0:1.0}
print(sorted(v.values()))

Result

[-2.0, 1.0, 3.5, 4.0]
asanoop24