Python: Take Every First, Second, Third Element in Sublist

Question

I'm using Python 2.7 and have the following:

my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

I'd like to create a 1-d list where the elements are ordered first by position within their sublist and then by the order of the sublists. So the correct output for the above list is:

[1, 4, 7, 2, 5, 8, 3, 6, 9]

Here's my (incorrect) attempt:

def reorder_and_flatten(my_list):
    my_list = [item for sublist in my_list for item in sublist]
    result_nums = []
    for i in range(len(my_list)):
        result_nums.extend(my_list[i::3])
    return result_nums
result = reorder_and_flatten(my_list)

This flattens my 2-d list and gives me:

[1, 4, 7, 2, 5, 8, 3, 6, 9, 4, 7, 5, 8, 6, 9, 7, 8, 9]

The first half of this list is correct but the second isn't.

I'd also like my function to be able to handle only 2 sublists. For instance, if given:

[[1, 2, 3], [], [7, 8, 9]]

the correct output is:

[1, 7, 2, 8, 3, 9]

Any thoughts?

Thanks!

abarnert · Accepted Answer · 2018-09-06T00:19:35.140

You're attempting to flatten, and then reorder, which makes things a lot harder than reordering and then flattening.

First, for your initial problem, that's just "unzip", as explained in the docs for zip:

>>> my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> list(zip(*my_list))
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

(In Python 2.7, you could just write zip(…) here instead of list(zip(…)), but this way, the same demonstration works identically in both 2.x and 3.x.)

And then, you already know how to flatten that:

>>> [item for sublist in zip(*my_list) for item in sublist]
[1, 4, 7, 2, 5, 8, 3, 6, 9]
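
As a rough sketch, the two steps above could be wrapped into a single function. The name reorder_and_flatten_equal is made up for illustration, and itertools.chain.from_iterable is swapped in for the nested comprehension; it assumes every sublist has the same length, because zip stops at the shortest one.

```python
from itertools import chain

def reorder_and_flatten_equal(my_list):
    # Hypothetical helper name; assumes all sublists have equal length.
    # zip(*my_list) groups the i-th items of every sublist together,
    # and chain.from_iterable flattens those groups in order.
    return list(chain.from_iterable(zip(*my_list)))

print(reorder_and_flatten_equal([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# [1, 4, 7, 2, 5, 8, 3, 6, 9]

print(reorder_and_flatten_equal([[1, 2, 3], [], [7, 8, 9]]))
# [] because zip stops at the shortest (here empty) sublist
```

The second call shows why plain zip is not enough once a sublist is empty or shorter than the others.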

But things get a bit more complicated for your second case, where some of the lists may be empty (or maybe just shorter?).

There's no function that's like zip but skips over missing values. You can write one pretty easily. But instead… there is a function that's like zip but fills in missing values with None (or anything else you prefer), izip_longest. So, we can just use that, then filter out the None values as we flatten:

>>> my_list = [[1, 2, 3], [], [7, 8, 9]]
>>> from itertools import izip_longest
>>> list(izip_longest(*my_list))
[(1, None, 7), (2, None, 8), (3, None, 9)]
>>> [item for sublist in izip_longest(*my_list) for item in sublist if item is not None]
[1, 7, 2, 8, 3, 9]

(In Python 3, the function izip_longest is renamed zip_longest.)

It's worth noting that the roundrobin recipe, as covered by ShadowRanger's answer, is an even nicer solution to this problem, and even easier to use (just copy and paste it from the docs, or pip install more_itertools and use it from there). It is a bit harder to understand—but it's worth taking the time to understand it (and asking for help if you get stuck).

If `None` is a legal element, you'd want to make `sentinel = object()`, and pass `sentinel` as the `fillvalue` for `zip_longest`, as well as testing `item is not sentinel`, so there is no possibility of dropping any input value. — ShadowRanger, Sep 05 '18 at 23:45
@ShadowRanger Yeah, I didn't want to get into that in the answer (since the OP is using all ints), hoping "… (or anything else you prefer) …" would be enough of a hint for future searchers, but it definitely does belong at least in a comment. — abarnert, Sep 05 '18 at 23:47
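
The sentinel idea from the comments might look roughly like this. It is only a sketch: the function name reorder_and_flatten_sentinel is invented here, and the try/except import is an assumption added so the same snippet runs on both 2.7 and 3.x.

```python
try:
    from itertools import izip_longest as zip_longest  # Python 2.7
except ImportError:
    from itertools import zip_longest  # Python 3

def reorder_and_flatten_sentinel(my_list):
    # A fresh object() can never be equal-by-identity to any input value,
    # so filtering on it cannot drop a legitimate None (or 0, or '').
    sentinel = object()
    return [item
            for group in zip_longest(*my_list, fillvalue=sentinel)
            for item in group
            if item is not sentinel]

print(reorder_and_flatten_sentinel([[1, None, 3], [], [7, 8]]))
# [1, 7, None, 8, 3]
```

Here the None in the first sublist survives, while the padding added for the empty and the shorter sublist is filtered out.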

score 1 · Answer 2 · answered Sep 05 '18 at 23:41

result = [l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]

i.e.

my_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]
# => [1, 4, 7, 2, 5, 8, 3, 6, 9]

my_list = [[1, 2, 3], [], [7, 8, 9]]
[l[i] for i in range(max(len(v) for v in my_list)) for l in my_list if l]
# => [1, 7, 2, 8, 3, 9]

mVChr

Note: If one of the sublists has a different, but non-zero, length from the others, this code will die with an `IndexError` (because it blithely indexes to the maximum index of any input sublist). – ShadowRanger Sep 05 '18 at 23:47
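
One possible way around that IndexError, sketched under the assumption that shorter sublists should simply be skipped once they run out, is to compare the index against each sublist's own length instead of testing for emptiness. The name interleave_by_index is hypothetical.

```python
def interleave_by_index(my_list):
    # Hypothetical helper; guards against an empty outer list because
    # max() of an empty sequence raises ValueError.
    longest = max(len(sub) for sub in my_list) if my_list else 0
    # "i < len(sub)" skips any sublist that has no item at position i,
    # which covers both empty and merely shorter sublists.
    return [sub[i] for i in range(longest) for sub in my_list if i < len(sub)]

print(interleave_by_index([[1, 2, 3], [4], [7, 8]]))
# [1, 4, 7, 2, 8, 3]
```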

ShadowRanger · Answer 3 · 2018-09-05T23:49:16.773

The itertools module's recipes section provides a roundrobin recipe that would do exactly what you want. It produces a generator, but your expected behavior would be seen with:

# define roundrobin recipe here
from itertools import cycle, islice
def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def reorder_and_flatten(my_list):
    return list(roundrobin(*my_list))

The main issue in your original code is that, after flattening, it loops over for i in range(len(my_list)): and extends the result with my_list[i::3] on every pass. Only the offsets 0, 1 and 2 are needed; from index 3 onwards, each slice repeats elements that were already taken (index 3 was already selected as the second element of the index 0 slice). There are lots of other small logic errors here, so it's much easier to reuse a recipe.
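
For comparison, a minimally patched version of the flatten-then-slice approach might look like the sketch below. The function name and the width parameter are made up for illustration, and it only works under the assumption that every non-empty sublist has exactly width items.

```python
def reorder_and_flatten_sliced(my_list, width=3):
    # Hypothetical fix: assumes each non-empty sublist has exactly `width` items.
    flat = [item for sublist in my_list for item in sublist]
    result = []
    # Loop over the offsets within a sublist (0 .. width-1),
    # not over every index of the flattened list.
    for offset in range(width):
        result.extend(flat[offset::width])
    return result

print(reorder_and_flatten_sliced([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# [1, 4, 7, 2, 5, 8, 3, 6, 9]
```

It breaks as soon as sublists have different non-zero lengths, which is one reason the recipe is the safer choice.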

This will be fairly performant, and generalize better than most hand-rolled solutions (it will round robin correctly even if the sublists are of uneven length, and it doesn't require second pass filtering or special handling of any kind to allow None as a value like zip_longest does).
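
The recipe as pasted above targets Python 2, where iterators have a next method. On Python 3 that method is called __next__, so an adapted sketch (same algorithm, loop variable renamed to fetch to avoid shadowing the built-in next) could read:

```python
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Python 3 adaptation of the recipe credited to George Sakkis.
    pending = len(iterables)
    fetchers = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for fetch in fetchers:
                yield fetch()
        except StopIteration:
            # One iterable just ran out: rebuild the cycle from the
            # remaining ones, which are next in line in the old cycle.
            pending -= 1
            fetchers = cycle(islice(fetchers, pending))

print(list(roundrobin([1, 2, 3], [], [7, 8, 9])))
# [1, 7, 2, 8, 3, 9]
```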

You should probably copy-paste the recipe there. Not that it's likely to disappear from the docs any time soon, but still. Also, the OP is using 2.7, so better to link the 2.x docs. But otherwise, yeah, hard to beat this. — abarnert, Sep 05 '18 at 23:46
@abarnert: Missed it was Py2. Updated link, copied in recipe for posterity (they have deleted recipes before, so your paranoia is not misplaced; the `pairwise` recipe used to have a generalized [`window` recipe](https://docs.python.org/release/2.3.5/lib/itertools-example.html) for arbitrary sized pairings that disappeared for some reason). — ShadowRanger, Sep 05 '18 at 23:53
Wow, never noticed that `window` went away; I thought it just got renamed to `windowed` (which is the name `more_itertools` uses, but IIRC, they expanded the original recipe anyway, so it doesn't fail badly if the window size is larger than the iterable). But yeah, it's gone. — abarnert, Sep 06 '18 at 00:17

score 1 · Answer 4 · answered Sep 05 '18 at 23:52

If you are happy to use a 3rd party library, you can use NumPy and np.ndarray.ravel:

import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

res_a = A.ravel('F')  # array([1, 4, 7, 2, 5, 8, 3, 6, 9])

For the case where you have one or more empty lists, you can use filter to remove empty lists:

B = np.array(list(filter(None, [[1, 2, 3], [], [7, 8, 9]])))

res_b = B.ravel('F')  # array([1, 7, 2, 8, 3, 9])

Both solutions require non-empty sublists to contain the same number of items. If list conversion is necessary you can use, for example, res_a.tolist().

While these "black box" methods won't teach you much, they will be faster for large arrays than list-based operations. See also What are the advantages of NumPy over regular Python lists?
