
If I want to generate a list of tuples based on elements of the lines of a document, I can do:

[(line.split()[0], line.split()[-1][3:8]) for line in open("doc.txt")]  

for example (I added the slicing to show that I might want to apply some operations to the elements of the split).

Still, I would like to avoid calling split twice, because that's inefficient.
So I wanted to use something like unpacking, with

[(linesplit0, linesplit1[3:8]) for line in open("doc.txt") for (linesplit0, linesplit1) in line.split()]  

but that can't work: iterating over the split yields individual strings, not tuples, so each element of the split falls short of the two values the unpacking expects.
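
A small sketch of why that attempt fails, using an in-memory list in place of doc.txt (the sample lines and the helper name `attempt` are made up for illustration). The inner `for` walks the words of the split one at a time, so the unpacking target receives a single string rather than a pair:

```python
# Hypothetical sample data standing in for the lines of doc.txt.
lines = ["alpha 1 xx_12345_yy\n", "beta 2 zz_67890_ww\n"]

def attempt(lines):
    # Same shape as the comprehension above: the inner `for` iterates over
    # the individual words of the split, not over the split as a whole.
    return [(first, last[3:8])
            for line in lines
            for (first, last) in line.split()]

try:
    attempt(lines)
    error = None
except ValueError as exc:
    # The first word, "alpha", has five characters, so it cannot be
    # unpacked into exactly two names.
    error = exc

print(type(error).__name__, error)
```

Note that a word of exactly two characters would unpack without an error (into its two characters), which is still not the intended result.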

What I would like is a way to bind a placeholder name (like splittedlist or whatever) to the list resulting from the split, so that it can be used with indexing (splittedlist[0]), with unpacking, or with both, while staying compatible with the list comprehension syntax.

Is it feasible?

Ando Jurai

2 Answers


You can use map (Python 3) or itertools.imap (Python 2) over open:

[(line[0], line[-1][3:8]) for line in map(str.split, open("doc.txt"))]

or use a generator expression:

[(line[0], line[-1][3:8]) for line in (l.split() for l in open("doc.txt"))]
Netwave
  • Thanks, that's nice too. I have another answer with a generator, but yours uses a different syntax. This is not nested, since nested is [a' for b in c for a in b], while yours is [a' for a in (b' for b in c)] (barring the intermediate processing, which could be whatever you want and is marked by a prime (') beside the letter). So what would you call this kind of syntax? "Cascading"? Does it have a specific name to distinguish it from nested? – Ando Jurai Apr 27 '17 at 12:03
  • There are two different kinds of nesting. This one nests one distinct generator inside another. Without parentheses, you have a single generator, but one that uses a nested comprehension iterator (the `for` part of the comprehension). – chepner Apr 27 '17 at 12:10
  • The generator syntax returns a generator object, so `(x for x in whatever)` returns a generator object, which is then used in the list comprehension. @AndoJurai – Netwave Apr 27 '17 at 12:12
  • I am accepting your answer as the best one for these explanations, but I should also point to this answer (http://stackoverflow.com/a/43656986/4218755), as it explains the benefit of using a generator defined in a function. – Ando Jurai Apr 27 '17 at 12:30
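
To make the two kinds of nesting from the comments concrete, here is a minimal sketch on in-memory sample lines (made up for illustration, standing in for doc.txt). The one-element-list form in the first variant is a common idiom for binding a name inside a comprehension; it is not taken from the answers above. The third variant uses starred unpacking, which assumes Python 3 and at least two fields per line:

```python
# Hypothetical sample data standing in for the lines of doc.txt.
lines = ["alpha 1 xx_12345_yy\n", "beta 2 zz_67890_ww\n"]

# 1) A single comprehension with two `for` clauses. Wrapping the split in a
#    one-element list binds it to the name `parts` exactly once per line.
nested = [(parts[0], parts[-1][3:8])
          for line in lines
          for parts in [line.split()]]

# 2) A separate generator expression feeding the list comprehension.
inner = (line.split() for line in lines)  # lazy generator object
chained = [(parts[0], parts[-1][3:8]) for parts in inner]

# 3) Same as 2), but unpacking the first and last fields directly.
unpacked = [(first, last[3:8])
            for first, *_, last in (line.split() for line in lines)]

print(nested)
print(nested == chained == unpacked)
```

All three call split only once per line; they differ only in where the intermediate list gets its name.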

You can use map with the unbound method str.split:

[(linesplit[0], linesplit[-1][3:8]) for linesplit in map(str.split, open("doc.txt"))]

However, I'd stay away from these; I'd use a generator function instead:

def read_input(filename):
    with open(filename) as f:
        for line in f:
            parts = line.split()
            yield parts[0], parts[-1][3:8]

It might be a bit more code, but it is easier to follow (readability counts), and the user has a choice between using read_input('doc.txt') as such or wrapping it in a list if needed.
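
A short usage sketch of both choices, assuming a temporary file with made-up contents in place of doc.txt. The function is repeated here only so that the example runs on its own:

```python
import os
import tempfile

def read_input(filename):
    with open(filename) as f:
        for line in f:
            parts = line.split()
            yield parts[0], parts[-1][3:8]

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file standing in for doc.txt.
    path = os.path.join(tmp, "doc.txt")
    with open(path, "w") as f:
        f.write("alpha 1 xx_12345_yy\nbeta 2 zz_67890_ww\n")

    # Used as such: iterate lazily, one line at a time.
    seen = []
    for name, code in read_input(path):
        seen.append(name)

    # Wrapped in a list when all results are needed at once.
    as_list = list(read_input(path))

    # Taking only the first result; close() ends the generator early,
    # which leaves the `with` block inside read_input and closes the file.
    gen = read_input(path)
    first = next(gen)
    gen.close()

print(as_list)
print(first)
```

Iterating lazily keeps only one line in memory at a time, which matters mostly for large files; for small inputs the list form is just as good.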

  • Clever. By "using it as such", do you mean, for example, as an iterator? What would be the other use cases of a generator per se? – Ando Jurai Apr 27 '17 at 11:59