3

I have two files, file1 and file2, and for every file1 line, I’m trying to check against all the lines of file2.

So I wrote a nested for loop with enumerate, but after checking the first line of file1 against all the lines of file2, the program just finishes instead of moving on to the next line of file1 and checking it against all the lines of file2.

Here is what I have:

def testing():
    file1 = open('file1.txt', 'r')
    file2 = open('file2.txt', 'r')

    for index1, line1 in enumerate(file1):
        for index2, line2 in enumerate(file2):
            # This prints the first line of file1.txt once for each line of
            # file2.txt, and then the program completes rather than going
            # back to the outer loop and continuing with the next line.
            print("THIS IS OUTER FOR LOOP LINE: " + line1 + " WITH LINE NUMBER: " + str(index1))

    file1.close()
    file2.close()

How can I check every line of file1 against all the lines of file2? What could I be doing wrong here?

Thank you in advance and will be sure to upvote/accept answer

Jo Ko
  • you could use `file2.readlines()` to load the contents of the second file into memory and iterate over that data then – Felix Apr 06 '17 at 18:17

3 Answers

4

Push the position of file2 back to the start at the top of each loop. Either close and reopen it as aryamccarthy suggested, or do it the cleaner way by simply moving the pointer:

file1 = open('file1.txt', 'r')
file2 = open('file2.txt', 'r')

for index1, line1 in enumerate(file1):
    file2.seek(0)  # Return to start of file
    for index2, line2 in enumerate(file2):
        pass  # Compare line1 against line2 here

file1.close()
file2.close()
Prune
1

You need to reopen file2 on every iteration. Move that code inside the loop. Otherwise, you reach the end of file2 after the first outer iteration, and you have nothing left to iterate over in the next round of the outer loop.
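
As a minimal sketch of what that could look like (the function name `compare_by_reopening` and its tuple output are made up for illustration, they are not from the question), the inner `open` runs once per outer iteration so that file2 is read from its first line each time:

```python
def compare_by_reopening(path1, path2):
    """Yield (index1, line1, index2, line2) for every pair of lines.

    Hypothetical helper: the name and the tuple layout are illustrative only.
    """
    with open(path1, 'r') as file1:
        for index1, line1 in enumerate(file1):
            # Reopening gives a fresh file object positioned at the first line,
            # so the inner loop has something to iterate over every time.
            with open(path2, 'r') as file2:
                for index2, line2 in enumerate(file2):
                    yield index1, line1.rstrip('\n'), index2, line2.rstrip('\n')
```

Calling `list(compare_by_reopening('file1.txt', 'file2.txt'))` should then produce one entry per combination of a file1 line with a file2 line, at the cost of one extra `open` per line of file1.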

Arya McCarthy
0

I would read the lines of each file into separate lists

with open('file1.txt') as f:
    content_1 = f.readlines()

with open('file2.txt') as f:
    content_2 = f.readlines()  # readlines(), not readline(): we want every line

and then proceed to compare the lists

for index1, line1 in enumerate(content_1):
    for index2, line2 in enumerate(content_2):
        pass  # do your stuff here
Mateo Torres
  • With this, you have to store the full contents of each file in memory, which can be risky (or fail) for very large input files. – Arya McCarthy Apr 06 '17 at 18:19
  • agreed, this is not an ideal solution for very big files, but for files of a couple thousand lines it should be ok... when dealing with very big files a better (but slower) idea would be to reopen the second file every time – Mateo Torres Apr 06 '17 at 18:22
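
One possible middle ground for the memory concern raised in these comments, sketched under the assumption that file2 fits in memory even if file1 does not: load only file2 into a list and stream file1 line by line. The function name `compare_streaming_outer` and the equality test are placeholders for illustration, since the question does not say what the actual check is.

```python
def compare_streaming_outer(path1, path2):
    """Return (index1, index2) pairs where the two lines are equal.

    Hypothetical helper: the equality check stands in for whatever
    comparison is really needed.
    """
    # Only the second file is held in memory, and it is read exactly once.
    with open(path2, 'r') as file2:
        content_2 = [line.rstrip('\n') for line in file2]

    matches = []
    # The first file is streamed, so its size does not affect memory use.
    with open(path1, 'r') as file1:
        for index1, line1 in enumerate(file1):
            stripped = line1.rstrip('\n')
            for index2, line2 in enumerate(content_2):
                if stripped == line2:
                    matches.append((index1, index2))
    return matches
```

If file2 is also too large to hold in memory, this does not help, and rewinding or reopening it on each outer iteration remains the fallback.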