unpacking a split inside a list comprehension

Question

If I want to generate a list of tuples based on elements of the lines of a document, I can do:

[(line.split()[0], line.split()[-1][3:8]) for line in open("doc.txt")]

for example (I added the slicing to show that I might want to apply some operations to the elements of the split).

Still, I would like to avoid calling split twice, because that's inefficient.
So I wanted to use something like unpacking, with

[(linesplit0, linesplit1[3:8]) for line in open("doc.txt") for (linesplit0, linesplit1) in line.split()]

but that can't work: the split contains plain strings, not tuples, so there is no pair to unpack at each element.

What I would like is something that lets me bind a placeholder name to the list resulting from the split (like splittedlist or whatever), that could then be used with indexing (splittedlist[0]), unpacking, or both, and that is compatible with list comprehension syntax.

Is it feasible?

`[(lambda words:(words[0], words[-1][3:8]))(line.split()) for line in open("doc.txt")]` — Karoly Horvath, Apr 27 '17 at 11:59
Use a nested generator expression: `[(ls[0], ls[-1][3:8]) for ls in (line.split() for line in open('doc.txt'))]`. Put the generator expression on a separate line in a variable if need be for readability. — Martijn Pieters, Apr 27 '17 at 12:16
Sorry for the duplicate, but the way it was worded did not help me find it. — Ando Jurai, Apr 27 '17 at 12:42

score 4 · Accepted Answer · answered Apr 27 '17 at 11:50

You can use map (Python 3) or itertools.imap (Python 2) over the open file:

[(line[0], line[-1][3:8]) for line in map(str.split, open("doc.txt"))]

or use a generator:

[(line[0], line[-1][3:8]) for line in (l.split() for l in open("doc.txt"))]
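
As a quick check that both forms give the same result, here is a minimal sketch. It assumes Python 3 and swaps the file for an `io.StringIO` holding made-up sample lines (the contents and the helper names `with_map` and `with_generator` are hypothetical, not from the question). Iterating over a `StringIO` yields lines much as a file returned by `open` does.

```python
import io

# Hypothetical sample data standing in for doc.txt.
SAMPLE = "alpha 1 abcdefghij\nbeta 2 0123456789\n"


def with_map(lines):
    # map applies str.split to each line once; the comprehension
    # then works on the resulting list of words.
    return [(words[0], words[-1][3:8]) for words in map(str.split, lines)]


def with_generator(lines):
    # The inner generator expression splits each line once, lazily.
    return [(words[0], words[-1][3:8])
            for words in (line.split() for line in lines)]


result_map = with_map(io.StringIO(SAMPLE))
result_gen = with_generator(io.StringIO(SAMPLE))
print(result_map)
print(result_gen)
```

In both versions each line is split exactly once, and the name bound by the outer `for` plays the role of the placeholder asked for in the question.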

Netwave

Thanks, that's nice too. I have another answer with a generator, but yours uses a different syntax. This is not nested, since nested is [a' for b in c for a in b], while yours is [a' for a in (b' for b in c)] (barring the intermediate processing, which could be whatever you want and is marked by a ' beside the letter). So what would you call this kind of syntax? "Cascading"? Does it have a specific name to distinguish it from nested? – Ando Jurai Apr 27 '17 at 12:03
There are two different kinds of nesting. This one nests one distinct generator inside another. Without parentheses, you have a single generator, but one that uses a nested comprehension iterator (the `for` part of the comprehension). – chepner Apr 27 '17 at 12:10
Generator expression syntax produces a generator object: `(x for x in whatever)` returns a generator, which is then consumed by the list comprehension. @AndoJurai – Netwave Apr 27 '17 at 12:12
I'm accepting your answer as the best one because of these explanations, but I should also point to this answer (http://stackoverflow.com/a/43656986/4218755), as it explains the benefit of wrapping the generator in a function. – Ando Jurai Apr 27 '17 at 12:30
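
The two shapes contrasted in these comments can be put side by side in a small sketch (Python 3; the sample lines and variable names are made up for illustration). The first comprehension has two `for` clauses and so visits every word of every line. The second iterates over a separate generator expression and so receives one list of words per line.

```python
import types

lines = ["a b c", "d e f"]  # hypothetical sample input

# One comprehension with two `for` clauses: the inner loop runs over each word.
flat = [word.upper() for line in lines for word in line.split()]

# A generator expression used as the iterable of the comprehension:
# each iteration receives the whole split list, so it can be indexed.
split_lines = (line.split() for line in lines)
is_generator = isinstance(split_lines, types.GeneratorType)
pairs = [(words[0], words[-1]) for words in split_lines]

print(flat)
print(is_generator)
print(pairs)
```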

score 2 · Answer 2 · answered Apr 27 '17 at 11:51

You can use map with the unbound method str.split:

[(linesplit[0], linesplit[-1][3:8]) for linesplit in map(str.split, open("doc.txt"))]

However, I'd stay away from these; I'd use a generator function instead:

def read_input(filename):
    with open(filename) as f:
        for line in f:
            parts = line.split()
            yield parts[0], parts[-1][3:8]

It might be a bit more code, but it is easier to follow (and readability counts), and the caller has a choice between using read_input('doc.txt') as such or wrapping it in a list if needed.
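
Both uses can be sketched as follows, assuming Python 3. The function is repeated so the example runs on its own, and a temporary file with made-up contents stands in for doc.txt.

```python
import os
import tempfile


def read_input(filename):
    with open(filename) as f:
        for line in f:
            parts = line.split()
            yield parts[0], parts[-1][3:8]


# Hypothetical sample file standing in for doc.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha 1 abcdefghij\nbeta 2 0123456789\n")
    path = tmp.name

# Used as such: iterate lazily, one tuple at a time.
seen = []
for first, piece in read_input(path):
    seen.append((first, piece))

# Wrapped in a list: materialize every tuple at once.
as_list = list(read_input(path))

os.remove(path)  # clean up the temporary file
print(seen)
print(as_list)
```

Iterating directly keeps only one line's result in memory at a time, which matters for large files; `list(...)` is convenient when the results need to be indexed or traversed more than once.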

Clever. By "using it as such", do you mean, for example, as an iterator? What would be another use case for a generator per se? — Ando Jurai, Apr 27 '17 at 11:59
