I am parsing an ISI file with a few hundred records, each of which begins with a 'PT J' line and ends with an 'ER' line. I am trying to pull the tagged info from each record inside a nested loop, but I keep getting an IndexError. I know why I am getting it (some lines are shorter than the indices I check), but does anyone have a better way of identifying the start of a new record than checking the first few characters one by one?
    while file:
        while line[1] + line[2] + line[3] + line[4] != 'PT J':
            ...
            # search through and record data from tags
            ...
I use the same character-by-character check to identify the individual tags, so I occasionally hit the same error there too. If you have any suggestions for that as well, I would greatly appreciate it!
Sample data, which you'll notice does not always include every tag for each record, is:
PT J
AF Bob Smith
TI Python For Dummies
DT July 4, 2012
ER
PT J
TI Django for Dummies
DT 4/14/2012
ER
PT J
AF Jim Brown
TI StackOverflow
ER
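
One way to sidestep the IndexError is to compare whole prefixes with `str.startswith` and split each line with `str.partition`, neither of which fails on short or empty lines. The following is only a minimal sketch under some assumptions: every tag sits on a single line as in the sample above, the tag is separated from its value by a space, and the helper name `parse_records` is made up for illustration.

```python
def parse_records(lines):
    """Yield one dict per record, mapping each tag to its value.

    Assumes (as in the sample data) that a record starts with a 'PT J'
    line, ends with an 'ER' line, and that each tag fits on one line.
    """
    record = None  # None means we are currently outside any record
    for raw in lines:
        line = raw.rstrip()  # drop the trailing newline and spaces
        if line.startswith("PT J"):
            record = {}  # start of a new record
        elif line == "ER":
            if record is not None:
                yield record  # end of the current record
            record = None
        elif record is not None and line:
            # partition never raises, even if there is no space in the line
            tag, _, value = line.partition(" ")
            if tag:  # skip lines that begin with a space (no tag)
                record[tag] = value.strip()


sample = """PT J
AF Bob Smith
TI Python For Dummies
DT July 4, 2012
ER
PT J
TI Django for Dummies
DT 4/14/2012
ER
PT J
AF Jim Brown
TI StackOverflow
ER
""".splitlines()

records = list(parse_records(sample))
for rec in records:
    # .get returns None for tags a record does not have
    print(rec.get("AF"), "|", rec.get("TI"), "|", rec.get("DT"))
```

Because missing tags are simply absent from the dict, `rec.get("AF")` returns `None` for the second record rather than raising an error. With a real file, the same function could be fed the open file object directly, since iterating over a file yields its lines.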