9

I have two lists, x and y:

>>> x = [2, 3, 4]
>>> y = [1, 2, 3]

I want to use these to create a new list. The new list will have each element in x repeated the number of times specified by the corresponding element in y. Hence, the desired output is

>>> new_list
[2, 3, 3, 4, 4, 4]

The order of the elements in new_list doesn't matter to me. It's also not crucial that it be a list -- any sequence type is fine.

What is the fastest, most efficient, most Pythonic way to achieve this?

abcd
    Asking for the "fastest, most efficient, most Pythonic way" is like asking for the "fastest, safest, and most purple car". While it's nice when the most Pythonic way is the fastest, caring overmuch about whether it's the fastest is as unpythonic as it gets. – DSM Oct 28 '15 at 03:49

4 Answers

13

numpy's repeat function gets the job done:

>>> import numpy as np
>>> x = [2, 3, 4]
>>> y = [1, 2, 3]
>>> np.repeat(x, y)
array([2, 3, 3, 4, 4, 4])
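
Since the question allows any sequence type, the array may be enough on its own. If a plain list is needed, a minimal sketch (assuming NumPy is installed; the names `repeated` and `as_list` are only illustrative) converts the result with `tolist()`:

```python
import numpy as np

x = [2, 3, 4]
y = [1, 2, 3]

# np.repeat returns an ndarray; each element of x is repeated
# by the count at the same position in y.
repeated = np.repeat(x, y)

# tolist() converts the array back to a list of plain Python ints.
as_list = repeated.tolist()
print(as_list)  # [2, 3, 3, 4, 4, 4]
```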
abcd
12
  1. You can use a list comprehension, like this:

    >>> x = [2, 3, 4]
    >>> y = [1, 2, 3]
    >>> [item for item, count in zip(x, y) for i in range(count)]
    [2, 3, 3, 4, 4, 4]
    

    Here, we zip x and y so that each element of x is paired with its corresponding count from y in a single tuple. Then, for each pair, the inner loop runs count times and emits the same item on every pass.

  2. If the objects in x are immutable, you can build a temporary list of count references to each item with [item] * count and flatten those lists, like this:

    >>> [i for item, count in zip(x, y) for i in [item] * count]
    [2, 3, 3, 4, 4, 4]
    
  3. You can do the same lazily, with itertools.repeat, like this

    >>> from itertools import chain, repeat
    >>> chain.from_iterable((repeat(item, count) for item, count in zip(x,y)))
    <itertools.chain object at 0x7fabe40b5320>
    >>> list(chain.from_iterable((repeat(item, cnt) for item, cnt in zip(x,y))))
    [2, 3, 3, 4, 4, 4]
    

    Please note that chain.from_iterable returns a lazy iterator, not a list. So, if you don't need all the elements at once, you can pull them from it one by one. This is highly memory efficient when the counts are very large, because the entire list is never built in memory; the values are generated on demand.

  4. Thanks ShadowRanger. You can actually apply repeat over x and y and get the result like this

    >>> list(chain.from_iterable(map(repeat, x, y)))
    [2, 3, 3, 4, 4, 4]
    

    Here, map passes the corresponding values from x and y to repeat, one pair at a time. So, the result of map will be

    >>> list(map(repeat, x, y))
    [repeat(2, 1), repeat(3, 2), repeat(4, 3)]
    

    Now, chain.from_iterable walks through the repeat objects produced by map and yields the values from each one in turn.
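
To make the laziness of the last two approaches concrete, here is a small sketch, assuming Python 3 (where `map` is lazy). The helper name `repeat_each` and the huge count are hypothetical, chosen only for illustration. Only the values actually requested are ever produced:

```python
from itertools import chain, islice, repeat


def repeat_each(values, counts):
    """Lazily yield each value repeated by its corresponding count."""
    return chain.from_iterable(map(repeat, values, counts))


x = [2, 3, 4]
y = [1, 2, 10**12]  # hypothetical count, far too large to materialize

lazy = repeat_each(x, y)

# islice pulls just the first five values; the remaining repeats of 4
# are never generated.
first_five = list(islice(lazy, 5))
print(first_five)  # [2, 3, 3, 4, 4]
```

Calling `list(lazy)` here would try to build the whole result, so the lazy form only pays off when the consumer iterates or slices.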

thefourtheye
    The `chain` approach can simplify to: `list(chain.from_iterable(map(repeat, x, y)))` (probably want to do `from future_builtins import map` if this is Py2, unnecessary on Py3). – ShadowRanger Oct 28 '15 at 03:48
  • Could also use `list(chain.from_iterable(starmap(repeat, izip(x, y))))` as well, with `izip -> zip` for python3.x ... Not sure if that's better though... – mgilson Oct 28 '15 at 03:51
  • @mgilson, any opportunity to use `starmap` must be considered Pythonic – John La Rooy Oct 28 '15 at 04:06
1

A simple for loop also works:

>>> x = [2, 3, 4]
>>> y = [1, 2, 3]
>>> final = []
>>> for index, item in enumerate(y):
...     final.extend([x[index]] * item)
...
>>> final
[2, 3, 3, 4, 4, 4]
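
The same loop can be sketched as a reusable function that pairs the lists with `zip` instead of indexing. The name `repeat_each` is hypothetical. One thing to keep in mind: `[value] * count` copies references, not objects, so repeated mutable items all point to the same object:

```python
def repeat_each(values, counts):
    """Return a list with each value repeated by its corresponding count."""
    result = []
    for value, count in zip(values, counts):
        # [value] * count holds `count` references to the same object.
        result.extend([value] * count)
    return result


print(repeat_each([2, 3, 4], [1, 2, 3]))  # [2, 3, 3, 4, 4, 4]

# With a mutable item, every repeat is the same object, so a change
# made through one entry is visible through the others.
rows = repeat_each([[0]], [2])
rows[0].append(1)
print(rows)  # [[0, 1], [0, 1]]
```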
Sagar Rakshe
1

One way to achieve this is to use the .elements() method of collections.Counter along with zip. For example:

>>> from collections import Counter

>>> x = [2, 3, 4]
>>> y = [1, 2, 3]

# `.elements()` returns an `itertools.chain` object, which is an iterator.
# To display its contents, it is converted to a `list` here.
>>> list(Counter(dict(zip(x,y))).elements())
[2, 3, 3, 4, 4, 4]
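
One caveat with this approach: `dict(zip(x, y))` keeps a single count per distinct value, so it only matches the other answers when the elements of x are unique (and hashable). The sketch below uses a hypothetical input with a duplicate to show the difference, along with one possible workaround that sums the counts instead:

```python
from collections import Counter

x = [2, 3, 2]  # hypothetical input: the value 2 appears twice
y = [1, 2, 3]

# dict(zip(x, y)) keeps only the last count seen for 2, so the
# first pair (2, 1) is silently dropped.
overwritten = Counter(dict(zip(x, y)))
print(sorted(overwritten.elements()))  # [2, 2, 2, 3, 3]

# Adding the counts per value keeps every repetition.
summed = Counter()
for value, count in zip(x, y):
    summed[value] += count
print(sorted(summed.elements()))  # [2, 2, 2, 2, 3, 3]
```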
Moinuddin Quadri