
I have a text file which contains several lines in the following format:

ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

The number after the word ELEMENT goes from 1 to 60. My first goal is to read this text file and stop at every occurrence of ELEMENT, from ELEMENT= 1 through ELEMENT= 60.

My test script reads the first occurrence of ELEMENT. I would now like to go through all 60 occurrences, so I tried to put a variable after ELEMENT in the search pattern. In this example I initialized it to 2 to see if it would work, and it doesn't (see the example code below).

elem= 2
lines = open("myfile.txt", "r" ).readlines()

for line in lines:
 if re.search( r"ELEMENT=      %i" (line, elem) ):
   words = line.split()

   energy = float( words[1] )

   print "%f" % energy
   break

I get the following error:

File "recup.py", line 42, in <module>
if re.search( r"ELEMENT=      %i" (line, elem) ):
TypeError: 'str' object is not callable

My question then is: how can I insert a variable into my search pattern?

  • "read this text file and stop to every occurrence of the word "ELEMENT= 1" to "ELEMENT= 60"" <- what is this supposed to mean? I tried to read your question three times and am still confused about what you want to do. – timgeb Jul 07 '14 at 15:06
  • _"... to see if it would work and as you can guess it doesn't"_. So, does it crash, or what? – Kevin Jul 07 '14 at 15:09
  • I have added the error that is returned. In fact my goal is to extract from a large text file only the 60 data blocks like the one shown in my post. I hope this is clearer. – Timothée S Jul 07 '14 at 15:33
  • yeps, I had a good guess - see my answer below :) – Nir Alfasi Jul 07 '14 at 15:35
  • Thank you all for your answers. If I could ask one more thing: what should I do to make my script start searching the text file not from the beginning, but from a line that contains a certain string? – Timothée S Jul 08 '14 at 07:18
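
One possible approach to the follow-up in the last comment is to skip lines until the marker string appears and only then start looking for ELEMENT lines. The sketch below uses `itertools.dropwhile` for that. The helper name `lines_from_marker` and the marker text `*** RESULTS ***` are made up for illustration; the real marker depends on the file.

```python
from itertools import dropwhile

def lines_from_marker(lines, marker):
    """Return an iterator over lines, starting at the first one containing marker."""
    return dropwhile(lambda line: marker not in line, lines)

# Hypothetical stand-in for the file contents; a real script could pass
# open("myfile.txt") instead of this list.
sample = [
    "ELEMENT=      9 PLY=  1",   # appears before the marker, so it is skipped
    "*** RESULTS ***",
    "ELEMENT=      1 PLY=  1",
    "ELEMENT=      2 PLY=  1",
]

found = [line for line in lines_from_marker(sample, "RESULTS")
         if line.startswith("ELEMENT")]
print(found)
```

Because `dropwhile` is lazy, this also works on a file object without reading the whole file into memory first.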

5 Answers


Just iterate over the blocks:

import re

txt='''\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

ELEMENT=      2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00    
    '''

for i, m in enumerate(re.finditer(r'^ELEMENT=\s+(\d+.*?)(?=^ELEMENT|\Z)', txt, re.M | re.S)):
    print 'Group {}===:\n{}'.format(i, m.group(1))

This finds each block of lines starting with ELEMENT and ending either at the next ELEMENT line or at the end of the file. You can then parse each block however you need.

Prints:

Group 0===:
1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00


Group 1===:
2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00  
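
As a rough illustration of the "parse each block" step, the sketch below turns every block into a list of GP rows, keyed by element number. It is only one way to do it: the names `block_re`, `gp_re` and `parse_blocks` are invented here, the sample is shortened to two values per GP line, and it assumes each element number appears only once (a repeated element, for example with several PLY values, would overwrite the earlier entry). It avoids `{}` in format strings, so it should also run on Python 2.6.

```python
import re

sample = """\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425
    GP= 1  4.324E-03 -1.350E-03
    GP= 2  1.435E-03 -3.529E-04

ELEMENT=      2 PLY=  1
-----------------------
 Code 1426
    GP= 1  7.742E-03 -3.542E-03
"""

# One match per block: group 1 is the element number, group 2 the rest of the block.
block_re = re.compile(r"^ELEMENT=\s+(\d+)(.*?)(?=^ELEMENT=|\Z)", re.M | re.S)
# One match per GP line: group 1 holds everything after "GP= n".
gp_re = re.compile(r"^[ \t]*GP=[ \t]*\d+[ \t]+(.*)$", re.M)

def parse_blocks(text):
    """Map each element number to its list of GP rows (lists of floats)."""
    result = {}
    for block in block_re.finditer(text):
        rows = [[float(value) for value in m.group(1).split()]
                for m in gp_re.finditer(block.group(2))]
        result[int(block.group(1))] = rows
    return result

data = parse_blocks(sample)
print(sorted(data.keys()))
print(data[1][0])
```

For the real file, `parse_blocks(open("myfile.txt").read())` would be the equivalent call.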
dawg
  • thanks for the answer. When i try to run your code I get the following error: `File "./stackover.py", line 25, in print 'Group {}===:\n{}'.format(i, m.group(1)) ValueError: zero length field name in format` What did I do wrong ? – Timothée S Jul 09 '14 at 13:36
  • You have Python 2.6? If so, you need to do `'Group {0}===:\n{1}'.format(i, m.group(1))` Note the 0 and 1 inside the curly braces. Or upgrade to 2.7 -- it has been out 4 years now... – dawg Jul 09 '14 at 14:07
  • Yes I have python 2.6. But we work on virtual machines at work and we have an old version of Ubuntu and from what I read, I shouldn't upgrade my python version. Thank you for answer. – Timothée S Jul 09 '14 at 14:57
  • You can keep the system Python in place and install a new version in a local or virtual directory. – dawg Jul 09 '14 at 15:35

I'm not entirely sure what you're trying to do, but if you're trying to test which iteration of ELEMENT you're on, this would be a better way:

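import re

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
  # re.match only matches at the start of the line.
  if re.match(r"ELEMENT=", line):
    # "ELEMENT=      2 PLY=  1" splits into ['ELEMENT=', '2', 'PLY=', '1'].
    words = line.split()
    if int(words[1]) == elem:
      # Do whatever you're trying to do.
      pass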
TheSoundDefense

If the line you are searching for always starts with "ELEMENT", there is an easy way to work around this:

lines = open("myfile.txt", "r").readlines()
for line in lines:
  if line.startswith("ELEMENT"):
    words = line.split()
    print "ELEMENT : " + words[1] + ", PLY : " + words[3]

This prints the element and ply numbers every time an "ELEMENT" line is found. You can extract the contents of the "Code" and "GP" lines with the same trick ;).
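
A minimal sketch of that same trick extended to the "Code" and "GP" lines might look like the following. The function name `read_elements` and the dictionary layout are invented for illustration, the sample is shortened to two values per GP line, and the code assumes there is always a space between "GP=" and the point number, as in the sample shown in the question.

```python
def read_elements(lines):
    """Collect the Code value and GP rows that follow each ELEMENT header."""
    elements = []
    current = None
    for line in lines:
        words = line.split()
        if not words:
            continue  # blank line between blocks
        if words[0].startswith("ELEMENT"):
            # "ELEMENT=      1 PLY=  1" splits into ['ELEMENT=', '1', 'PLY=', '1'].
            current = {"element": int(words[1]), "ply": int(words[3]),
                       "code": None, "gp": []}
            elements.append(current)
        elif current is not None and words[0] == "Code":
            current["code"] = int(words[1])
        elif current is not None and words[0].startswith("GP"):
            # "GP= 1  4.324E-03 ..." splits into ['GP=', '1', '4.324E-03', ...].
            current["gp"].append([float(w) for w in words[2:]])
    return elements

# Shortened stand-in for the file; open("myfile.txt") could be passed instead.
sample = [
    "ELEMENT=      1 PLY=  1",
    "-----------------------",
    " Code 1425",
    "    GP= 1  4.324E-03 -1.350E-03",
    "    GP= 2  1.435E-03 -3.529E-04",
    "",
    "ELEMENT=      2 PLY=  1",
    "-----------------------",
    " Code 1426",
    "    GP= 1  7.742E-03 -3.542E-03",
]

elements = read_elements(sample)
for e in elements:
    print("ELEMENT %d, PLY %d, Code %d, %d GP rows"
          % (e["element"], e["ply"], e["code"], len(e["gp"])))
```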

Jerome

A few simple changes:

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
    words = line.split()
    # Blank lines split into an empty list, so check before indexing.
    if words and words[0].startswith('ELEMENT'):
        energy = int(words[1])
        if energy == elem:
            break

# energy holds the integer that follows ELEMENT= on the matching line.
print "%d" % energy

Don't compare floats with ==; it seldom turns out well.
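
To illustrate the point about float comparison, here is a small sketch. The helper `close` and its default tolerance are arbitrary choices made for this example, not part of any library.

```python
total = 0.1 + 0.2

# Exact comparison fails: 0.1 and 0.2 have no exact binary representation.
print(total == 0.3)   # False

def close(a, b, tol=1e-9):
    """Return True when a and b differ by at most tol (absolute tolerance)."""
    return abs(a - b) <= tol

print(close(total, 0.3))   # True
```

Converting the element number with `int`, as in the code above, sidesteps the problem entirely because integer comparison is exact.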

gkusner

If I understand your question correctly, you can "plant" a variable into the search like this:

if re.search( r"ELEMENT=      {}".format(elem), line ):
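
For context, the original `TypeError` came from writing `r"..." (line, elem)`, which tries to call the string as if it were a function. Below is a sketch of a slightly more forgiving variant of the same idea. It assumes the number of spaces after `ELEMENT=` may vary with the width of the number, so it uses `\s+` rather than a fixed run of spaces, and it adds `\b` so that 2 does not also match 25. It uses `%d` formatting because bare `{}` fields raise `ValueError` on Python 2.6. The helper name `element_pattern` is made up for this example.

```python
import re

def element_pattern(elem):
    """Compile a pattern matching the header line of element number elem."""
    # \s+ tolerates any column width; \b stops 2 from matching 25.
    return re.compile(r"^ELEMENT=\s+%d\b" % elem)

lines = [
    "ELEMENT=      1 PLY=  1",
    "ELEMENT=      2 PLY=  1",
    "ELEMENT=     25 PLY=  1",
]

pattern = element_pattern(2)
matches = [line for line in lines if pattern.search(line)]
print(matches)
```

Looping `elem` over `range(1, 61)` and calling `element_pattern(elem)` each time would cover all 60 headers, although a single pass that captures the number with `(\d+)` is usually simpler.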
Nir Alfasi