Matching a file in this form. It always begins with InvNo, ~EOR~ is End Of Record.
InvNo: 123
Tag1: rat cake
Media: d234
Tag2: rat pudding
~EOR~
InvNo: 5433
Tag1: strawberry tart
Tag5: 's got some rat in it
~EOR~
InvNo: 345
Tag2: 5
Media: d234
Tag5: rather a lot really
~EOR~
It should become
IN 123
UR blabla
**
IN 345
UR blibli
**
Where UR is a URL. I want to keep the InvNo as first tag. ** is now the end of record marker. This works:
impfile = filename[:4]
media = open(filename + '_earmark.dat', 'w')
with open(impfile, 'r') as f:
HASMEDIA = False
recordbuf = ''
for line in f:
if 'InvNo: ' in line:
InvNo = line[line.find('InvNo: ')+7:len(line)]
recordbuf = 'IN {}'.format(InvNo)
if 'Media: ' in line:
HASMEDIA = True
mediaref = line[7:len(line)-1]
URL = getURL(mediaref) # there's more to it, but that's not important now
recordbuf += 'UR {}\n'.format(URL))
if '~EOR~' in line:
if HASMEDIA:
recordbuf += '**\n'
media.write(recordbuf)
HASMEDIA = False
recordbuf = ''
media.close()
Is there a better, more Pythonic way? Working with the recordbuffer and the HASMEDIA flag seems, well, old hat. Any examples or tips for good or better practice?
(Also, I'm open to suggestions for a more to-the-point title to this post)