I have a file with a specific line of interest (say, line 12) that looks like this:

conform: 244216 (packets) exceed: 267093 (packets)

I've written a script to pull the first number via regex and dump the value into a new file:

import re

# Note: the index is zero-based, so [12] selects the 13th line of the file.
with open("file1.txt", "r") as f:
    getexceeds = f.readlines()[12]
output = re.search(r"\d+", getexceeds).group(0)

with open("file2.txt", "w") as outp:
    outp.write(output)

I am not quite good enough yet to write the second number in that line to a new file. Can anyone suggest a way?

Thanks as always for any help!

captain yossarian

2 Answers

Another possibility is to use re.findall(), which returns a list of all matches:

>>> import re
>>> strg = 'conform: 244216 (packets) exceed: 267093 (packets)'
>>> m = re.findall(r"\d+", strg)
>>> m
['244216', '267093']
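
To tie this back to the question, here is a minimal sketch of how the second element of that list could be written to a file. The helper names `second_number` and `write_second_number` are made up for illustration, and the sample string stands in for the line read from file1.txt.

```python
import re

def second_number(line):
    """Return the second run of digits in line as a string, or None."""
    numbers = re.findall(r"\d+", line)
    # Index 1 is the second match; guard against lines with fewer than two numbers.
    return numbers[1] if len(numbers) >= 2 else None

def write_second_number(line, out_path):
    """Write the second number to out_path; return True if something was written."""
    value = second_number(line)
    if value is None:
        return False
    with open(out_path, "w") as outp:
        outp.write(value)
    return True

# Sample line from the question (assumed to be representative of the real file).
sample = "conform: 244216 (packets) exceed: 267093 (packets)"
print(second_number(sample))
```

In the original script this would be called as `write_second_number(getexceeds, "file2.txt")`.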
Roman
  • This answer will also work, yes -- but your regex \d{6} is no good. OP's input suggests it's a number of packets. There's no reason to assume that it will always be a 6-digit number. – FrobberOfBits Sep 18 '14 at 13:46
  • Good point. Thanks for that FrobberOfBits. I edited my post accordingly. – Roman Sep 18 '14 at 13:48
You've got it almost right; your regex only matches the first number, though.

match = re.search(r"(\d+).*?(\d+)", getexceeds)
firstNumber = match.group(1)
secondNumber = match.group(2)

Notice that the regex contains two capturing groups (the parts in parentheses), each matching a sequence of digits. What's in between can be anything: .*? matches as few characters as possible, of any kind.

Here's a little test I ran from the shell:

>>> import re
>>> str = 'conform: 244216 (packets) exceed: 267093 (packets)'
>>> match = re.search(r"(\d+).*?(\d+)", str)
>>> print(match.group(1))
244216
>>> print(match.group(2))
267093
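
If the labels in the line are stable, a slightly stricter variant is to anchor each group on its label and handle the case where nothing matches. This is only a sketch: the pattern assumes every line looks like the sample in the question, and the names `PACKETS_RE` and `parse_counters` are made up for illustration.

```python
import re

# Assumed layout, taken from the sample line: "conform: N (packets) exceed: M (packets)".
PACKETS_RE = re.compile(r"conform:\s*(?P<conform>\d+).*?exceed:\s*(?P<exceed>\d+)")

def parse_counters(line):
    """Return (conform, exceed) as ints, or None if the line does not match."""
    match = PACKETS_RE.search(line)
    if match is None:
        return None
    return int(match.group("conform")), int(match.group("exceed"))

sample = "conform: 244216 (packets) exceed: 267093 (packets)"
print(parse_counters(sample))
```

Named groups make it harder to mix up which number is which, and the None check avoids an AttributeError when a line doesn't have the expected shape.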
FrobberOfBits
  • One suggestion: do NOT name a variable `str` (or any other built-in type name, like `list`), because doing so shadows the built-in `str` type and can cause confusing errors later in the program. – SilentCloud Apr 06 '21 at 07:51