How to split line in readlines and save them in different list?

Question

This is my code:

with open('file.txt', 'r') as source:
    # Indentation
    polTerm = [line.strip().split()[0] for line in source.readlines()]
    polFreq = [int(line.strip().split()[1]) for line in source.readlines()]

This is the content of file.txt:

anak 1
aset 3
atas 1
bangun 1
bank 9
benar 1
bentuk 1

I got the polTerm just like what I want:

['anak', 'aset', 'atas', 'bangun', 'bank', 'benar', 'bentuk']

but for the polFreq, instead of this:

['1', '3', '1', '1', '9', '1', '1']

what I got is an empty list like this:

[ ]

Does anyone know why this happens, and how to fix it so that I get the result I want?

Does this answer your question? [Using "readlines()" twice in a row](https://stackoverflow.com/questions/10201008/using-readlines-twice-in-a-row) — Carcigenicate, Nov 04 '19 at 00:50
Thanks for your answer! Another question: is there a better way to do it besides this? — prs_wjy, Nov 04 '19 at 00:53
Thanks! I just added `lines = source.readlines()` before `polTerm = [line.strip().split()[0] for line in source.readlines()]` and changed every `source.readlines()` to `lines`. — prs_wjy, Nov 04 '19 at 00:59
You have 4 answers, which seem to solve your issue. Do you still need help or advice on some details of these answers? — gelonida, Nov 09 '19 at 00:43

score 1 · Answer 1 · answered Nov 04 '19 at 00:55

with open('file.txt', 'r') as source:
    lines = source.readlines()
    polTerm = [line.strip().split()[0] for line in lines]
    polFreq = [int(line.strip().split()[1]) for line in lines]

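The reason is that the file object keeps track of its read position. The first call to readlines() reads everything up to the end of the file and returns it as a list, so the second call has nothing left to read and returns an empty list. Saving the result in a variable once lets you use it as many times as you need.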
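A minimal sketch of this behaviour, using `io.StringIO` as a stand-in for the opened file (an assumption made so the example runs without `file.txt`; a real text file object handles its read position the same way):

```python
import io

# Stand-in for open('file.txt', 'r'); the sample lines come from the question.
source = io.StringIO("anak 1\naset 3\natas 1\n")

first = source.readlines()   # reads up to the end of the stream
second = source.readlines()  # nothing left to read, so this is an empty list

source.seek(0)               # move the read position back to the start
third = source.readlines()   # the lines can be read again

print(first)
print(second)
print(third)
```

Rewinding with `seek(0)` works, but storing the list once, as in the code above, avoids reading the file a second time.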

moctarjallo

You could unindent the last two lines. They don't have to be in the `with` statement. This would close the file a little earlier – gelonida Nov 04 '19 at 01:01
@gelonida I know. But that's another question, an efficiency question. Answering two questions at the same time could add complexity for the asker. I only barely modified his code so that he can follow along and figure out where he went wrong. – moctarjallo Nov 04 '19 at 01:14
True. I just think it's easier to read and makes it clearer that the file access is done (that the file is only read once). But of course in this case it's not really important. – gelonida Nov 04 '19 at 01:38

score 1 · Accepted Answer · answered Nov 04 '19 at 01:02

As Carcigenicate said, .readlines() reads the rest of the file and returns it as a list. If you don't save that list in a variable, calling .readlines() a second time returns an empty list, because the first call already moved the file position to the end of the file. What you want is this:

with open("file.txt","r") as inf:
    # Save the list of lines in a variable so it can still be used
    # after the with block has closed the file.
    # A second inf.readlines() call here would return an empty
    # list, because the file has already been read to the end.
    raw = inf.readlines()

polTerm = [line.strip().split()[0] for line in raw]
polFreq = [int(line.strip().split()[1]) for line in raw]

Pro tip: Learn to use pandas, specifically, pd.read_csv().

Juan David · Answer 3 · 2019-11-04T06:05:00.927

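with open('file.txt', 'r') as source:
    data = source.readlines()

a1 = []  # terms (first column)
a2 = []  # frequencies (second column), kept as strings
for line in data:
    x = line.split()
    a1.append(x[0])
    a2.append(x[1])  # use int(x[1]) if you need numbers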

edited Nov 04 '19 at 06:05

answered Nov 04 '19 at 00:55

Juan David

You could unindent immediately after the `data=` line. This would close the file a little earlier and reduce the indentation level (In my opinion normally a good idea) – gelonida Nov 04 '19 at 01:01
That is true, thanks for the advice, it's very useful regarding efficiency. – Juan David Nov 04 '19 at 06:06

gelonida · Answer 4 · 2019-11-04T01:36:34.877

@Carcigenicate gives you the literal answer.

However, in my opinion you just shouldn't read the file twice (unless the file is really huge and all of its lines wouldn't fit into memory).

If the file is not that huge, there is no need to read it twice. If it is fairly large, read just the first two columns into memory and separate them afterwards.

What I'd suggest is:

with open('file.txt', 'r') as source:
    cols_1_and_2 = [line.strip().split(None, 2)[:2] for line in source.readlines()]

polTerm = [cols[0] for cols in cols_1_and_2]
polFreq = [int(cols[1]) for cols in cols_1_and_2]
del cols_1_and_2  # this line is to free some memory if that would be an issue
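As a variation on the same single-read idea, the two columns can also be separated with `zip`. This is only a sketch: `split_columns` is a hypothetical helper name, `io.StringIO` stands in for the opened file, and it assumes every non-blank line has at least two whitespace-separated fields.

```python
import io

def split_columns(lines):
    """Return (terms, freqs) from an iterable of 'term count' lines."""
    # Keep only the first two fields of each non-blank line.
    rows = [line.split()[:2] for line in lines if line.strip()]
    if not rows:
        return [], []
    # zip(*rows) turns the list of [term, count] rows into two columns.
    terms, freqs = zip(*rows)
    return list(terms), [int(f) for f in freqs]

# Stand-in for open('file.txt', 'r'); a file object can be passed the same way.
sample = io.StringIO("anak 1\naset 3\natas 1\nbangun 1\nbank 9\n")
polTerm, polFreq = split_columns(sample)

print(polTerm)
print(polFreq)
```

Iterating over the file object directly, instead of calling `readlines()`, reads one line at a time, so only the extracted columns are held in memory.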
