
I am trying to parse a text file that contains data like the following:

============================= condition 1 ============================

condition 2 string

col 1 col2 tags
------------------------------
      xx xx abc
      xx xx ac

col4 col 1       col5 col6        col7       col8     col9      col10     col11        col12   col13
-----------------------------------------------------------------------------------------------------
     1  11        6    30         abc        text    -2794      682         57388      294      210
     2  11        6    30         ac         text    -2447      680         52973      302      214
     3  11        13                         text    -2619     -805        120956      568      255
     4  11        16                         text     2185    -1261        116983      548      273
     5  11        17                         text    -3362    -1413        127136      569      278

Criterion 30 : xxxxx         text3 11 : some text here

============================================================================

This is what I want to do:

  1. Look for condition 1 and, if it is met, make sure the condition 2 string is present.
  2. Pick the values in the column 'tags'.
  3. Look for these tags in the table below and extract the information from columns 9 to 13.

I can do the third part. However, I am struggling with the first two: when I use f.next() to check condition 2, it consumes that line, so the loop never sees it and the rest of my code breaks:

with open(each_file) as f:
    copy = False
    i = 0
    for linenum,line in enumerate(f):
        if line.strip() == "============================= Condition 1 ============================":
            line_next = f.next()
            if line_next.strip() == "condition 2 string":
                print "here1"
                print line.strip()
                copy = True ## Start copying when both the conditions are met

            elif line_next.strip() == "col4 col 1       col5 col6        col7       col8     col9      col10     col11        col12   col13": ## Stop copying at this line
                if i == 0:
                    copy = False
                else:
                    copy = False
                i = i + 1

        elif copy:
            print copy
            print line

Please help me with this.

martineau
wonder
  • 1
    I think you correctly identified your issue (`f.next()` is popping, not peeking). If your file isn't too big, you could read it all into memory (e.g. as a list `lines`) and check `lines[linenum+1]`. Alternatively, you could set a flag when you find "==== Condition 1 ====", and then the next iteration, check that both the flag and "condition 2" are present. I'd prefer the first option for all but the biggest of input files, it'll lead to cleaner code, IMHO. – jedwards May 01 '18 at 15:28
  • 1
    Possible duplicate of [Parsing Text Files](https://stackoverflow.com/questions/35776471/parsing-text-files) – quamrana May 01 '18 at 15:30
  • @jedwards the flag idea worked, thank you so much :) – wonder May 01 '18 at 16:03
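
For anyone landing here later, here is a minimal sketch of the first option from jedwards's comment: read the lines into a list and look at the next entry instead of calling `f.next()`. The name `find_block_starts` and its parameters are made up for illustration. It assumes blank lines between the banner and the condition 2 line should be ignored (as in the sample data) and that the banner is the condition text padded with `=` signs.

```python
def find_block_starts(lines, cond_1="condition 1", cond_2="condition 2 string"):
    """Return positions (in the list of non-blank lines) where the
    condition 1 banner is immediately followed by the condition 2 line."""
    # Drop blank lines so "next line" means "next line with content".
    stripped = [ln.strip() for ln in lines if ln.strip()]
    starts = []
    for i in range(len(stripped) - 1):
        line = stripped[i]
        # A banner looks like "==== condition 1 ===="; strip the padding.
        is_banner = line.startswith("=") and line.strip("= ") == cond_1
        # Peek at the following entry without consuming anything.
        if is_banner and stripped[i + 1] == cond_2:
            starts.append(i)
    return starts
```

Because the lookahead is a plain list index, nothing is skipped in the loop, which is the difference from `f.next()`. Calling it as `find_block_starts(f.readlines())` inside the `with open(...)` block should be enough for files that fit in memory.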

1 Answer


This should do what you want. It tracks condition 1 with a flag instead of calling f.next(), so the loop never skips a line:

with open(each_file) as f:
    cond_1 = False  # becomes True once the "condition 1" banner has been seen
    copy = False    # True while lines should be printed
    for line in f:  # iterate the file directly; no need for readlines()
        line = line.strip()
        #print("DEBUG: line is <{0}>".format(line))
        if line == "============================= condition 1 ============================":
            print("DEBUG: condition 1")
            cond_1 = True

        elif cond_1 and line == "condition 2 string":
            print("DEBUG: condition 2 / start copying")
            copy = True ## Start copying when both the conditions are met

        elif line == "col4 col 1       col5 col6        col7       col8     col9      col10     col11        col12   col13": ## Stop copying at this line
            print("DEBUG: stop copying")
            copy = False

        if copy:
            #print("DEBUG: Copying...")
            print(line)
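
If you also want steps 2 and 3 in the same pass, the flag idea extends to a small state machine. This is only a sketch based on the sample in the question, and the function name `extract_tagged_rows` is made up. It assumes that each table ends at a blank line, that the tag is the last field of each row in the first table, that rows carrying a tag in the second table have all 11 fields (so the tag is the fifth field), and that columns 9 to 13 are the last five fields and hold integers. The banner comparison is case-insensitive, which is also an assumption.

```python
def extract_tagged_rows(lines):
    """Return {tag: [col9, ..., col13]} for the first block in which
    condition 1 is followed by the condition 2 string."""
    state = "seek_cond_1"
    tags = []
    result = {}
    for raw in lines:
        line = raw.strip()
        is_rule = set(line) == {"-"}  # the "------" line under a header
        if state == "seek_cond_1":
            if line.startswith("=") and line.strip("= ").lower() == "condition 1":
                state = "expect_cond_2"
        elif state == "expect_cond_2":
            if not line:
                continue  # tolerate blank lines between the two conditions
            state = "seek_tags" if line == "condition 2 string" else "seek_cond_1"
        elif state == "seek_tags":
            if line.endswith("tags"):
                state = "read_tags"
        elif state == "read_tags":
            if is_rule:
                continue
            if not line:
                state = "seek_table"  # blank line ends the tags table
            else:
                tags.append(line.split()[-1])
        elif state == "seek_table":
            if line.startswith("col4"):
                state = "read_table"
        elif state == "read_table":
            if is_rule:
                continue
            if not line:
                break  # blank line ends the data table
            fields = line.split()
            # Only full rows carry a tag; it sits in the fifth field.
            if len(fields) == 11 and fields[4] in tags:
                result[fields[4]] = [int(v) for v in fields[-5:]]
    return result
```

A file object yields lines when iterated, so `extract_tagged_rows(f)` inside the `with open(each_file) as f:` block should work without reading the whole file first.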
curusarn