Search for a string with a variable number in Python

Question

I have a text file which contains several lines in the following format:

ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

The number after the word ELEMENT goes from 1 to 60. My first goal is to read this text file and stop at every occurrence of ELEMENT, from ELEMENT= 1 to ELEMENT= 60.

My test script reads the first occurrence of ELEMENT. I would now like to go through all 60 occurrences, so I tried to put a variable after ELEMENT in the search pattern. In this example I initialized it to 2 to see if it would work, and as you can guess, it doesn't (see the example code below).

elem= 2
lines = open("myfile.txt", "r" ).readlines()

for line in lines:
 if re.search( r"ELEMENT=      %i" (line, elem) ):
   words = line.split()

   energy = float( words[1] )

   print "%f" % energy
   break

I get the following error code:

File "recup.py", line 42, in <module>
if re.search( r"ELEMENT=      %i" (line, elem) ):
TypeError: 'str' object is not callable

My question, then, is: how do I put a variable into my search pattern?

"read this text file and stop to every occurence of the word "ELEMENT= 1" to "ELEMENT= 60"" <- what is this supposed to mean? I tried to read your question three times and am still confused about what you want to do. — timgeb, Jul 07 '14 at 15:06
_"... to see if it would work and as you can guess it doesn't"_. So, does it crash, or what? — Kevin, Jul 07 '14 at 15:09
I have added the error code returned. In fact my goal is to extract from a large text file only the 60 data block shown in my post. I hope this is clearer. — Timothée S, Jul 07 '14 at 15:33
Thank you all for your answers. If I could ask one more thing: what should I do to make my script start searching the text file not from the beginning, but from a line that contains a certain string? — Timothée S, Jul 08 '14 at 07:18

score 0 · Accepted Answer · answered Jul 07 '14 at 15:18

Just iterate over the blocks:

import re

txt='''\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

ELEMENT=      2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00    
    '''

for i, m in enumerate(re.finditer(r'^ELEMENT=\s+(\d+.*?)(?=^ELEMENT|\Z)', txt, re.M | re.S)):
    print 'Group {}===:\n{}'.format(i, m.group(1))

This finds each block of lines starting with ELEMENT and ending either at the next ELEMENT line or at the end of the file. Each block found can then be parsed into whatever structure you need.

Prints:

Group 0===:
1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00


Group 1===:
2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00
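
As a follow-up to "parse the block found into whatever", here is a minimal sketch of one possible parsing step. It assumes every header line has the form ELEMENT= n PLY= m and every GP line holds a point number followed by floats, as in the sample. The names `parse_blocks`, `BLOCK_RE`, `GP_RE` and `sample` are made up for this illustration, and the code is written to run on Python 2.6+ as well as Python 3.

```python
import re

# Hypothetical sample in the same layout as the question's file.
sample = """\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00

ELEMENT=      2 PLY=  22
-----------------------
 Code 1426
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
"""

# One match per block: element number, ply number, then the block body
# up to the next ELEMENT line or the end of the text.
BLOCK_RE = re.compile(
    r'^ELEMENT=\s+(\d+)\s+PLY=\s+(\d+)(.*?)(?=^ELEMENT=|\Z)', re.M | re.S)
# One match per GP line: point number, then the rest of the line.
GP_RE = re.compile(r'^\s*GP=\s*(\d+)\s+(.*)$', re.M)


def parse_blocks(text):
    """Return {element number: {'ply': int, 'gp': {point: [floats]}}}."""
    result = {}
    for m in BLOCK_RE.finditer(text):
        gp = {}
        for g in GP_RE.finditer(m.group(3)):
            gp[int(g.group(1))] = [float(v) for v in g.group(2).split()]
        result[int(m.group(1))] = {'ply': int(m.group(2)), 'gp': gp}
    return result


blocks = parse_blocks(sample)
print(sorted(blocks))          # element numbers found
print(blocks[2]['gp'][5])      # values for GP 5 of element 2
```

Keying the result by element number means a specific element can be looked up directly instead of re-scanning the file for each of the 60 blocks.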

Thanks for the answer. When I try to run your code I get the following error: `File "./stackover.py", line 25, in print 'Group {}===:\n{}'.format(i, m.group(1)) ValueError: zero length field name in format` What did I do wrong? — Timothée S, Jul 09 '14 at 13:36
You have Python 2.6? If so, you need to do `'Group {0}===:\n{1}'.format(i, m.group(1))` Note the 0 and 1 inside the curly braces. Or upgrade to 2.7 -- it has been out 4 years now... — dawg, Jul 09 '14 at 14:07
Yes, I have Python 2.6. But we work on virtual machines at work with an old version of Ubuntu, and from what I read, I shouldn't upgrade my Python version. Thank you for your answer. — Timothée S, Jul 09 '14 at 14:57
You can keep the system Python in place and install a new version in a local or virtual directory. — dawg, Jul 09 '14 at 15:35

score 0 · Answer 2 · answered Jul 07 '14 at 15:18

I'm not entirely sure what you're trying to do, but if you're trying to test which iteration of ELEMENT you're on, this would be a better way:

import re

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
  # Header lines look like "ELEMENT=      2 PLY=  22"
  if re.match(r"ELEMENT=", line):
    words = line.split()
    if int(words[1]) == elem:
      # Do whatever you're trying to do.
      pass

score 0 · Answer 3 · answered Jul 07 '14 at 15:20

If the line you are searching for always starts with "ELEMENT", there is an easy way to work around this:

lines = open("myfile.txt", "r").readlines()
for line in lines:
  if line.startswith("ELEMENT"):
    words = line.split()
    print "ELEMENT : " + words[1] + ", PLY : " + words[3]

Using this, you will print the element and ply numbers every time you find an "ELEMENT" line. You can easily extract the contents of the "Code" and "GP" lines using the same trick (strip the leading whitespace first, since those lines are indented) ;).
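
A minimal sketch of that same trick applied to the "Code" and "GP" lines might look like the following. It assumes there is always whitespace between "GP=" and the point number, as in the sample. The list `lines` is hypothetical stand-in data (in practice it would come from the file), and `records` is a name chosen for this illustration.

```python
# Hypothetical sample lines; in practice these would come from the file,
# e.g. lines = open("myfile.txt", "r").readlines()
lines = [
    "ELEMENT=      1 PLY=  1\n",
    "-----------------------\n",
    " Code 1425\n",
    "    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00\n",
    "    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00\n",
]

records = []
for line in lines:
    stripped = line.strip()      # "Code" and "GP" lines are indented
    words = stripped.split()
    if stripped.startswith("ELEMENT"):
        # words: ["ELEMENT=", element number, "PLY=", ply number]
        records.append(("ELEMENT", int(words[1]), int(words[3])))
    elif stripped.startswith("Code"):
        records.append(("Code", int(words[1])))
    elif stripped.startswith("GP="):
        # words[0] is "GP=", words[1] the point number, the rest are values
        records.append(("GP", int(words[1]), [float(w) for w in words[2:]]))

for record in records:
    print(record)
```

The separator line of dashes matches none of the prefixes and is simply skipped.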

score 0 · Answer 4 · answered Jul 07 '14 at 15:21

A few simple changes:

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
    words = line.split()
    # Skip blank lines, which split into an empty list
    if words and words[0].startswith('ELEMENT'):
        energy = int(words[1])  # the element number, read as an int
        if energy == elem:
            break

print "%d" % energy

Don't try to compare floats with ==; it seldom turns out well.
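
A short sketch of why float equality is fragile, and of comparing with a tolerance instead. The helper `close_enough` and its default tolerance are hypothetical choices for this illustration, not a library function.

```python
a = 0.1 + 0.2
b = 0.3

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the sum differs from 0.3 in the last bits.
print(a == b)


def close_enough(x, y, tol=1e-9):
    """Hypothetical helper: treat x and y as equal within an absolute tolerance."""
    return abs(x - y) <= tol


print(close_enough(a, b))
```

For the element numbers in the question this problem does not arise once they are converted with int(), since integers compare exactly.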

score 0 · Answer 5 · answered Jul 07 '14 at 15:24

If I understand your question correctly, you can "plant" a variable into the search like this:

if re.search( r"ELEMENT=      {}".format(elem), line ):
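
For context, the TypeError in the question comes from writing a string literal directly followed by parentheses, which Python reads as an attempt to call the string. A slightly more tolerant variant of the line above is sketched here, assuming header lines always begin with ELEMENT= followed by padding and the number. The function name `element_pattern` and the sample `lines` are made up for this illustration.

```python
import re


def element_pattern(elem):
    """Compile a pattern matching the header line of element number `elem`."""
    # \s+ tolerates any amount of column padding before the number;
    # \b stops elem=1 from also matching 10, 11, ...
    # {0} (rather than {}) keeps str.format working on Python 2.6.
    return re.compile(r"^ELEMENT=\s+{0}\b".format(int(elem)))


# Hypothetical sample lines standing in for the file contents.
lines = [
    "ELEMENT=      1 PLY=  1\n",
    "    GP= 1  4.324E-03 -1.350E-03\n",
    "ELEMENT=     10 PLY=  1\n",
    "ELEMENT=      2 PLY=  22\n",
]

for elem in (1, 2, 10):
    pattern = element_pattern(elem)
    # Indices of the lines whose header matches this element number
    hits = [i for i, line in enumerate(lines) if pattern.search(line)]
    print("{0}: {1}".format(elem, hits))
```

Looping `elem` over range(1, 61) would visit all 60 headers in turn, although a single pass over the file that captures the number is usually cheaper than 60 separate scans.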

Answered by Nir Alfasi.
