I have a file containing numbers and two words, "start" and "middle". I want to read the numbers from "start" to "middle" into one array and the numbers from "middle" to the end of the file into another array. This is my Python code:

with open("../MyList","r") as f:
        for x in f.readlines():
            if x == "start\n":
                continue
            if x == "middle\n":
                break
            x = x.split("\n")[0]
            list_1.append(int(x)) 

        print list_1

        for x in f.readlines():
            if x == "middle\n":
                continue
            list_2.append(int(x))

        print list_2

The problem is that my program never enters the second loop and jumps straight to

print list_2

I searched older questions but cannot figure out the problem.

Shi.Bi

4 Answers


Your first loop reads the entire file to the end (f.readlines() consumes everything up front), but processes only half of it. By the time the second loop starts, the file pointer is already at the end, so no new lines are read.
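To see the effect in isolation, here is a small sketch. The temporary file and its contents are made up for illustration and only stand in for the asker's file:

```python
import os
import tempfile

# Hypothetical sample data standing in for the asker's file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("start\n1\n2\nmiddle\n3\n4\n")
    path = tmp.name

with open(path, "r") as f:
    first = f.readlines()   # consumes the whole file, whatever the loop does afterwards
    second = f.readlines()  # the file pointer is already at EOF

os.remove(path)

print(len(first))  # 6
print(second)      # []
```

Breaking out of the first for loop early does not help, because readlines() has already read every line before the loop body runs.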

From the Python docs:

file.readlines([sizehint])

Read until EOF using readline() and return a list containing the lines thus read. If the optional sizehint argument is present, instead of reading up to EOF, whole lines totalling approximately sizehint bytes (possibly after rounding up to an internal buffer size) are read. Objects implementing a file-like interface may choose to ignore sizehint if it cannot be implemented, or cannot be implemented efficiently.

Either process everything in one loop, or read line-by-line (using readline instead of readlines).
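A minimal sketch of the readline() route, assuming the file begins with "start", holds one integer per line, and contains no blank lines. The temporary file and its contents are made up for illustration:

```python
import os
import tempfile

# Hypothetical sample data standing in for the asker's file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("start\n1\n2\nmiddle\n3\n4\n")
    path = tmp.name

list_1, list_2 = [], []
with open(path, "r") as f:
    # First loop: readline() consumes one line per call, so stopping at
    # "middle" leaves the rest of the file unread.
    while True:
        line = f.readline()
        if line == "" or line.strip() == "middle":  # "" means EOF
            break
        if line.strip() == "start":
            continue
        list_1.append(int(line))

    # Second loop: continues from the line right after "middle".
    while True:
        line = f.readline()
        if line == "":
            break
        list_2.append(int(line))

os.remove(path)

print(list_1)  # [1, 2]
print(list_2)  # [3, 4]
```

Because each call advances the file pointer by exactly one line, the second loop picks up where the first one stopped, and no seek or reopen is needed.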

Henning Koehler
  • As I tested and understood it, readline() reads one character at a time, not one line, and xreadlines() reads one line at a time (see https://www.peterbe.com/plog/blogitem-040312-1). – Shi.Bi Nov 26 '16 at 16:37
  • readline() reads one line. From [the online documentation](https://docs.python.org/2/library/stdtypes.html#file-objects): `file.readline([size]) - Read one entire line from the file`. xreadlines returns an iterator. – Henning Koehler Nov 26 '16 at 20:47
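One possible source of the confusion in the comments above (a guess, not something the thread confirms) is looping directly over the result of readline(): the call returns one whole line as a string, and iterating over a string yields characters. A short Python 3 sketch, using an in-memory io.StringIO object as a stand-in for a real file:

```python
import io

f = io.StringIO("12\n34\n")  # in-memory stand-in for a file

line = f.readline()   # one whole line, including the trailing newline
chars = list(line)    # iterating over a str yields single characters

print(repr(line))  # '12\n'
print(chars)       # ['1', '2', '\n']
```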

It's because you read the whole file in the 1st loop. When execution reaches the 2nd loop, the file pointer is already at the end of the file, so f.readlines() returns an empty list.

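You can fix that either by reopening the file or by setting the file pointer back to the beginning of the file with f.seek(0) before the 2nd for loop. Note that the 2nd loop then starts from the top again, so it has to skip everything up to "middle" itself.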

with open("../MyList","r") as f:
    for x in f.readlines():
        pass  # process your stuff for 1st loop

    # reset file pointer to beginning of file again
    f.seek(0)

    for x in f.readlines():
        pass  # process your stuff for 2nd loop

Reading the whole file into memory is not efficient when you are processing a large file. You can instead iterate over the file object directly, as in the code below:

list1 = []
list2 = []
list1_start = False
list2_start = False
with open("../MyList","r") as f:
    for x in f:
        if x.strip() == 'start':
            list1_start = True
            continue
        elif x.strip() == 'middle':
            list2_start = True
            list1_start = False
            continue

        if list1_start:
            list1.append(x.strip())
        elif list2_start:
            list2.append(x.strip())

print(list1)
print(list2)
Skycc

You can read the whole file once into a list and slice it afterwards.

If possible, you can try this:

with open("sample.txt","r") as f:
    # read every line once, dropping the trailing newline
    fulllist = [x.rstrip("\n") for x in f]

print(fulllist)

start_position = fulllist.index('start')
middle_position = fulllist.index('middle')
# the file in the question has no 'end' marker, so the second slice
# simply runs to the end of the list
list_1 = fulllist[start_position+1:middle_position]
list_2 = fulllist[middle_position+1:]
print("list1 : %s" % list_1)
print("list2 : %s" % list_2)
Shivkumar kondi

Discussion

Your problem is that you read the whole file at once, and when you start the 2nd loop there's nothing to be read...

A possible solution involves reading the file line by line, tracking the start and middle keywords and updating one of two lists accordingly.

This implies that your script, during the loop, has to maintain information about its current state. For this purpose we are going to use a variable, code, that is either 0, 1 or 2, meaning no action, append to list no. 1, or append to list no. 2. Because in the beginning we don't want to do anything, its initial value must be 0:

code = 0

If we want to access one of the two lists using the value of code as a switch, we could write a test or, in place of a test, we can use a list of lists, lists, containing a dummy list and two lists that are updated with valid numbers. Initially all these inner lists are equal to the empty list []

l1, l2 = [], []
lists = [[], l1, l2]

so that later we can do as follows

    lists[code].append(number)

With these premises, it's easy to write the body of the loop on the file lines,

  • read a number
  • if it's not a number, look if it is a keyword
    • if it is a keyword, change state
    • in any case, no further processing
  • if we have to append, append to the correct list

        try:
            n = int(line)
        except ValueError:
            if line == 'start\n' : code=1
            if line == 'middle\n': code=2
            continue
        if code: lists[code].append(n)
    

We just have to add a bit of boilerplate (opening the file and looping over its lines), and that's all.

Below you can see my test data, the complete source code with all the details and a test execution of the script.

Demo

$ cat start_middle.dat
1
2
3
start
5
6
7
middle
9
10

$ cat start_middle.py
l1, l2 = [], []
code, lists = 0, [[], l1, l2]

with open('start_middle.dat') as infile:
    for line in infile.readlines():
        try:
            n = int(line)
        except ValueError:
            if line == 'start\n' : code=1
            if line == 'middle\n': code=2
            continue
        if code: lists[code].append(n)

print(l1)
print(l2)
$ python start_middle.py 
[5, 6, 7]
[9, 10]
$ 
gboffi