How to skip a lot of text or values in 2 files and do another task with the data

Question

In the following code I want to skip a lot of unusable content in the files ex1.idl and ex2.idl and get to the data I need to work with. This data begins at the 905th line of each file. Snippets of the files are:

ex1.idl

0.11158E-13 0.11195E-13 0.11233E-13 ...

ex2.idl

0.11010E-13 0.11070E-13 0.11117E-13 ...

I can successfully skip the unneeded values. I can also do some splitting, slicing and calculating. But when I combine the two, the code does not seem to work. The following is the combined code that I have:

with open('ex1.idl') as f1, open('ex2.idl') as f2:
    with open('ex3.txt', 'w') as f3:

        a = 905                           #the first part
        f1 = f1.readlines(905:)[a-1:]     #the first part
        f2 = f2.readlines(905:)[a-1:]     #the first part

        f1 = map(float, f1.read().strip().split())              #the second part
        f2 = map(float, f2.read().strip().split())              #the second part
        for result in map(lambda v: v[0]/v[1], zip(f1, f2)):    #the second part
            f3.write(str(result)+"\n")                          #the second part

This is the code where I just read the data and do the splitting and calculating alone. This works:

with open('primer1.idl') as f1, open('primer2.idl') as f2:
    with open('primer3.txt', 'w') as f3:

        f1 = map(float, f1.read().strip().split())
        f2 = map(float, f2.read().strip().split())
        for result in map(lambda v: v[0]/v[1], zip(f1, f2)):
            f3.write(str(result)+"\n")

So all I want to add is that the program starts reading and computing at line 905.

Thanks in advance for the answer.

Would you mind sharing what the files `ex1.idl` and `ex2.idl` look like? Are they multiline files? Or are they each one line with values separated by a space, just like [`here`](https://stackoverflow.com/questions/45732274/calculating-the-quotient-between-two-files-and-writing-it-into-another-file)? — Abdou, Aug 17 '17 at 18:01
They are **multiline files**. First there are words describing the data set with no particular value (species, units... junk, if you will), then somewhere another dataset is included in the same way (but it is not important), and then at the **905th line the useful data starts**, as you described before. I tried implementing [905:] (the first answer) before I asked the question, but it failed with: float division by zero. — Robert, Aug 17 '17 at 19:57
So besides the junk lines, how many lines with actual data are contained in each file? — Abdou, Aug 17 '17 at 20:06
The data I need to analyse is on lines 905 to 7399, the same range in both files. — Robert, Aug 17 '17 at 20:10

score 1 · Answer 1 · answered Aug 25 '17 at 10:40

I have done some work here and found that this also works:

with open('ex1.idl') as f1, open('ex2.idl') as f2:
    with open('ex3.txt', 'w') as f3:

        start_line = 905        #reading from this line forward
        for i in range(start_line - 1):
            next(f1)
            next(f2)
        f1 = list(map(float, f1.read().split()))
        f2 = list(map(float, f2.read().split()))
        for result in map(lambda v : v[0]/v[1], zip(f1,f2)):
            f3.write((str(result)) + '\n')

Abdou · Accepted Answer · 2017-08-17T20:33:16.090

Try this:

from itertools import islice


with open('ex1.idl') as f1, open('ex2.idl') as f2:
    with open('ex3.txt', 'w') as f3:
        f1 = islice(f1, 904, None)  # skip the first 904 lines; start at line 905
        f2 = islice(f2, 904, None)  # skip the first 904 lines; start at line 905

        for f1_line, f2_line in zip(f1, f2):
            f1_vals = map(float, f1_line.strip().split())
            f2_vals = map(float, f2_line.strip().split())
            for result in map(lambda v: v[0]/v[1], zip(f1_vals, f2_vals)):
                f3.write(str(result)+"\n")

This ignores the first 904 lines in each file. Then it zips the contents of the files together so that corresponding lines from each file end up in the same tuple. You can then loop through this zipped data, split each line on whitespace, convert the values to floats and do the division.
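The skip-then-zip idea described above can be tried on small in-memory data before running it on the real files. The following is only a minimal sketch: the helper name `divide_from_line` and the demo contents are made up for illustration, and `io.StringIO` objects stand in for the opened `.idl` files. It also shows why the `islice` start index is one less than the 1-based line number.

```python
from io import StringIO
from itertools import islice


def divide_from_line(lines1, lines2, start_line):
    """Yield quotients of paired values, starting at 1-based start_line.

    Hypothetical helper for illustration, not part of the original answer.
    """
    # islice counts from 0, so line N sits at index N - 1.
    tail1 = islice(lines1, start_line - 1, None)
    tail2 = islice(lines2, start_line - 1, None)
    for line1, line2 in zip(tail1, tail2):
        # split() with no argument handles any whitespace and the newline.
        for v1, v2 in zip(line1.split(), line2.split()):
            yield float(v1) / float(v2)


# Tiny stand-ins for the real files: two junk lines, then two data lines.
demo1 = StringIO("species\nunits\n1.0 4.0\n9.0 16.0\n")
demo2 = StringIO("species\nunits\n2.0 2.0\n3.0 4.0\n")

quotients = list(divide_from_line(demo1, demo2, start_line=3))
print(quotients)  # [0.5, 2.0, 3.0, 4.0]
```

With the real files you would pass the open file objects and `start_line=905` instead of the demo data.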

Please note that if a line in the second file (the divisor) contains a zero, you will get a ZeroDivisionError (float division by zero). Please make sure you don't have zeros there. If you cannot guarantee that, then you should handle it with a try-except and skip the value:

from itertools import islice


with open('ex1.idl') as f1, open('ex2.idl') as f2:
    with open('ex3.txt', 'w') as f3:
        f1 = islice(f1, 904, None)  # skip the first 904 lines; start at line 905
        f2 = islice(f2, 904, None)  # skip the first 904 lines; start at line 905

        for f1_line, f2_line in zip(f1, f2):
            f1_vals = map(float, f1_line.strip().split())
            f2_vals = map(float, f2_line.strip().split())
            for v1, v2 in zip(f1_vals, f2_vals):
                try:
                    result = v1/v2
                    f3.write(str(result)+"\n")
                except ZeroDivisionError:
                    print("Encountered a value equal to zero in the second file. Skipping...")
                    continue

You can do whatever else you like, other than skipping. That's up to you.
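As one possible alternative to skipping, you could write a placeholder so that the output keeps one entry per input pair. This is just a sketch under that assumption: the helper name `safe_quotient` and the choice of NaN as the placeholder are illustrative and not something the question asks for.

```python
import math


def safe_quotient(v1, v2, placeholder=math.nan):
    """Return v1 / v2, or the placeholder when v2 is zero.

    Hypothetical helper; NaN is an assumed placeholder choice.
    """
    try:
        return v1 / v2
    except ZeroDivisionError:
        return placeholder


# The second pair has a zero divisor and yields the placeholder instead.
results = [safe_quotient(a, b) for a, b in [(1.0, 2.0), (3.0, 0.0)]]
print(results)
```

Inside the loop above you would then call `f3.write(str(safe_quotient(v1, v2)) + "\n")` in place of the try-except block.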
