parsing blocks of text within a text file (blocks separated by empty line)

Question

I'm a brand new Python user who wants to parse a text file that looks like this:

$ begin
$ vertex -1285.6 -226.2 -538.7 0
$ track 11 1000 0.7277 0.6765 0.1133 0
$ end

$ begin
$ vertex -1265.3 573.1 1547.7 0
$ track 11 1000 -0.7679 0.1650 0.6189 0
$ end

For every block ($ begin ... $ end) I want to get vertex coordinates x y z:

$ begin
$ vertex x y z 0
$ track 11 1000 -0.7679 0.1650 0.6189 0
$ end

Can somebody suggest a way to do this? I am very grateful for any help or advice!

Iterate through the first column and check for vertex. If vertex is found, grab your coordinates and store them in a list. – Teja, Apr 14 '17 at 21:07
Hi Teja, thanks for the advice. Can you provide a code sample if it's not too much work for you? – user638699, Apr 14 '17 at 21:09

score 3 · Answer 1 · alpeshpandya · 2017-04-14T21:15:10.860

You can use regex here.

import re
pattern = re.compile("\n\n")
print(pattern.split(content))

Explanation: This looks for the pattern of two consecutive newline characters in your string and splits the string into a list at each occurrence.

For example:

   import re

   with open('data.txt', 'r') as myfile:
       content = myfile.read()
   # content = "hello world \n line 1 \n line 2 \n\n line 3 line 4 \n line 5"
   print(content)
   pattern = re.compile("\n\n")
   print(pattern.split(content))

Result of the split for the commented-out sample string: ['hello world \n line 1 \n line 2 ', ' line 3 line 4 \n line 5']
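To connect the blank-line split back to the original question, here is a minimal sketch that splits the text into blocks and then pulls the vertex coordinates out of each block. The function name vertices_per_block is made up for illustration, the sample text is copied from the question, and the sketch assumes each block has at most one line of the form "$ vertex x y z 0".

```python
import re

# Sample data copied from the question; in practice this would come from myfile.read().
content = """$ begin
$ vertex -1285.6 -226.2 -538.7 0
$ track 11 1000 0.7277 0.6765 0.1133 0
$ end

$ begin
$ vertex -1265.3 573.1 1547.7 0
$ track 11 1000 -0.7679 0.1650 0.6189 0
$ end
"""


def vertices_per_block(text):
    """Return one (x, y, z) tuple of floats per blank-line-separated block."""
    vertices = []
    # Split wherever a newline is followed by optional whitespace and another
    # newline, so lines holding only spaces also count as blank.
    for block in re.split(r"\n\s*\n", text.strip()):
        for line in block.splitlines():
            fields = line.split()
            # Assumed layout: $ vertex x y z 0
            if len(fields) >= 5 and fields[1] == "vertex":
                vertices.append(tuple(float(v) for v in fields[2:5]))
                break  # assumption: at most one vertex line per block
    return vertices


block_vertices = vertices_per_block(content)
print(block_vertices)
```

Splitting with r"\n\s*\n" instead of a literal "\n\n" is a design choice here, not part of the answer above: it tolerates separator lines that contain stray spaces.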

edited Apr 14 '17 at 21:15

answered Apr 14 '17 at 21:09

alpeshpandya


Hi alpeshpandya, Thanks! Can you explain what this does? I'm a little confused. Maybe can you provide an example? – user638699 Apr 14 '17 at 21:10
Added explanation and example – alpeshpandya Apr 14 '17 at 21:15

score 2 · Accepted Answer · answered Apr 14 '17 at 21:18

Let's presume you have a text file called myfile.txt with your data in it. And let's give labels to each of the items in the row:

marker = $
label = vertex OR track OR begin, etc
x = your x value
y = your y value
z = your z value
eol = the last value on the vertex line

As we read each line, we can check to see if the line includes the term 'vertex'.

If it does, we then split that line using the split function. By default, split will split on whitespace, but let's explicitly call out what we want to split on (i.e. ' '). Split produces a list of elements.

We can then use tuple unpacking to break each item out of the list and assign them individual labels so our code is more readable. We can then print the values. In your case, you probably want to save or process those values... just replace the print statement with your processing code.

with open('myfile.txt') as file:
    for line in file:
        if 'vertex' in line:
            # strip the trailing newline before splitting on single spaces
            fields = line.strip().split(' ')
            marker, tag, x, y, z, eol = fields
            print(x, y, z)
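If you want to keep the values rather than print them, one possible version of the "processing code" is sketched below. It is only an illustration: collect_vertices is a made-up helper name, io.StringIO stands in for the opened file, and the lines are taken from the question's sample. It also calls split() with no argument, which is an assumption on my part that the fields may be separated by more than one space.

```python
import io

# Stand-in for open('myfile.txt'); the lines are taken from the question's sample.
sample = io.StringIO(
    "$ begin\n"
    "$ vertex -1285.6 -226.2 -538.7 0\n"
    "$ track 11 1000 0.7277 0.6765 0.1133 0\n"
    "$ end\n"
)


def collect_vertices(lines):
    """Collect (x, y, z) as floats from every line whose second field is 'vertex'."""
    vertices = []
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) == 6 and fields[1] == "vertex":
            marker, tag, x, y, z, eol = fields
            vertices.append((float(x), float(y), float(z)))
    return vertices


vertices = collect_vertices(sample)
print(vertices)
```

Because the function takes any iterable of lines, the same call should work with a real file object, e.g. inside a with open('myfile.txt') block.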

Brilliant, thank you!!! I awarded you the check mark because you gave such a clear explanation. — user638699, Apr 14 '17 at 21:22

score 1 · Answer 3 · answered Apr 14 '17 at 21:22

import csv  # not actually used below

with open('data.txt', 'r') as f:
    text = f.readlines()
for line in text:
    if 'vertex' in line:
        fields = line.split(' ')
        print(fields[2], fields[3], fields[4])

answered Apr 14 '17 at 21:22

Teja


why are you importing csv? – Sekuraz Apr 15 '17 at 00:16

score 0 · Answer 4 · answered Apr 14 '17 at 21:46

Short and sweet with a touch of re

import re

with open('t.data') as f:
    vertices = re.findall(r'\$ vertex (\S+) (\S+) (\S+) 0', f.read())
print(vertices)

Gives:

[('-1285.6', '-226.2', '-538.7'), ('-1265.3', '573.1', '1547.7')]
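Note that findall returns the coordinates as strings. If numbers are needed, a small follow-up step could convert them, roughly as sketched here. The text variable holds two vertex lines from the question as a stand-in for the file contents, and float_vertices is just an illustrative name.

```python
import re

# Two vertex lines from the question's sample, standing in for the file contents.
text = "$ vertex -1285.6 -226.2 -538.7 0\n$ vertex -1265.3 573.1 1547.7 0\n"

# Same pattern as above, written as a raw string so \$ and \S reach the
# regex engine unchanged.
matches = re.findall(r"\$ vertex (\S+) (\S+) (\S+) 0", text)

# Each match is a tuple of three strings; convert every component to float.
float_vertices = [tuple(float(v) for v in m) for m in matches]
print(float_vertices)
```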

answered Apr 14 '17 at 21:46

sotapme

