I have files that contain both strings and floats. I am interested in finding the float that follows a specific string. Any help writing a function that reads the file, looks for that string, and returns the float after it would be much appreciated.

Thanks

An example of a file is

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

import re

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

str_to_search = 'xxxxxxxxxx'
num = re.findall(r'^' + str_to_search + r' (\d+\.\d+)', lines, flags=re.M)
print(num)

This works if there is no negative sign. In other words, if the number after the string 'xxxxxxxxxx' is 1.099 rather than -1.099, it works fine. How can I generalize the pattern so that it handles both positive numbers (no sign) and negative numbers (leading minus sign)?

AbuStack
    Duplicate of https://stackoverflow.com/questions/74506916/how-to-return-the-remaining-a-line-in-a-file-after-a-specific-string – AcK Nov 20 '22 at 12:14

3 Answers

3

You can use a regex that allows an optional leading minus sign:

(-?\d+\.?\d*)

import re

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5
xxxxxxxxxx 1.099"""

str_to_search = "xxxxxxxxxx"
num = re.findall(fr"(?m)^{str_to_search}\s+(-?\d+\.?\d*)", lines)
print(num)

Prints:

['-1.099', '1.099']
Andrej Kesely
2

You can change the regex to the following:

num = re.findall(r'^' + str_to_search + r' (-?\d+\.?\d*)', lines, flags=re.M)
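
To see what the number pattern `-?\d+\.?\d*` accepts, it can help to test it on a few strings. The sketch below does that, and also shows a broader pattern covering a leading `+`, a bare leading dot, and exponents. The broader pattern and the names `SIMPLE` and `BROAD` are assumptions for illustration only; the question only asks about a minus sign.

```python
import re

# Pattern from the answer: optional minus, digits, optional dot, optional digits.
SIMPLE = re.compile(r"-?\d+\.?\d*")

# Hypothetical broader pattern: optional sign, digits with an optional
# fraction (or a bare ".5" form), and an optional exponent such as e-3.
BROAD = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

samples = ["-1.099", "1.099", "12", "+3.5", ".5", "1e-3"]
for text in samples:
    simple_ok = SIMPLE.fullmatch(text) is not None
    broad_ok = BROAD.fullmatch(text) is not None
    print(f"{text!r}: simple={simple_ok} broad={broad_ok}")
```

The simple pattern fully matches the first three samples only, while the broader one matches all six. Every sample is also a string that `float()` can parse.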
1

I would just split the entire file content on whitespace. This gives a list of all the strings and floats. Then use list.index("xxxxxxxxxx") to find the index of the string you are searching for, wrapped in try/except so that your code won't stop if the string is not in the contents. Then just read the next element and try to convert it to a float. Code:

lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

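tokens = lines.split()  # split on any whitespace, including newlines and repeated spaces

try:
    float_index = tokens.index("xxxxxxxxxx") + 1  # position of the element after the search string
    num = float(tokens[float_index])
except (ValueError, IndexError) as e:
    # ValueError: the string was not found, or the next element is not a float
    # IndexError: the string is the last element in the file
    print(e)
else:
    print(num)  # only reached when num was assigned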

If you are looking for a solution in regex, use Andrej Kesely's answer.
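
Note that list.index only finds the first occurrence. If the string can appear several times, the same split-based idea could be extended as in the sketch below. The function name `floats_after_token` is hypothetical. Unlike the regex versions, this matches the string anywhere in the file, not only at the start of a line, and only works for search strings that contain no whitespace.

```python
def floats_after_token(text, token):
    """Return the float following each occurrence of token in text."""
    tokens = text.split()  # split on any run of whitespace
    results = []
    # Walk over adjacent pairs: (token, the token right after it).
    for current, following in zip(tokens, tokens[1:]):
        if current != token:
            continue
        try:
            results.append(float(following))
        except ValueError:
            pass  # the next token is not a number, so skip it
    return results


lines = """aaaaaaaaaaaaaaa  bbbbbbbbbbbbbbb  cccccccccc
qq vvv rrr ssssa 22.6
zzzzx bbbb 12.0
xxxxxxxxxx -1.099
zzzz bbb nnn 33.5"""

print(floats_after_token(lines, "xxxxxxxxxx"))
```

One caveat: float() also accepts strings such as "nan" and "inf", so a following token with one of those spellings would be returned as a float.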

MarshiDev