return the second instance of a regex search in a line

Question

I have a file with a specific line of interest (say, line 12) that looks like this:

conform: 244216 (packets) exceed: 267093 (packets)

I've written a script to pull the first number via regex and dump the value into a new file:

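import re

# readlines() returns a 0-indexed list, so index 12 is the thirteenth line
with open("file1.txt", "r") as inp:
    getexceeds = inp.readlines()[12]
output = re.search(r"\d+", getexceeds).group(0)

with open("file2.txt", "w") as outp:
    outp.write(output)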

I am not quite good enough yet to write the second number in that line into a new file. Can anyone suggest a way?

Thanks as always for any help!

`re.findall(r"\d+", getexceeds)[1]` – Ashwini Chaudhary Sep 18 '14 at 13:31

Roman · Answer 1 · 2014-09-18T13:49:17.450

13

Another possibility is re.findall(), which returns a list of all matches:

>>> import re
>>> strg = 'conform: 244216 (packets) exceed: 267093 (packets)'
>>> m = re.findall(r"\d+", strg)
>>> m
['244216', '267093']
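
To get only the second number out of that list, index it with [1]. The following is a minimal sketch rather than a definitive solution: the helper name second_number is hypothetical, and returning None when the line holds fewer than two numbers is an assumption about how a malformed line should be handled.

```python
import re

def second_number(line):
    """Return the second run of digits in line, or None if there are fewer than two."""
    numbers = re.findall(r"\d+", line)
    # Guard the index so a short or malformed line does not raise IndexError.
    return numbers[1] if len(numbers) >= 2 else None

line = "conform: 244216 (packets) exceed: 267093 (packets)"
print(second_number(line))  # 267093
```

The returned string can be written to file2.txt the same way the question writes the first number.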

edited Sep 18 '14 at 13:49

answered Sep 18 '14 at 13:42

This answer will also work, yes -- but your regex \d{6} is no good. OP's input suggests it's a number of packets. There's no reason to assume that it will always be a 6-digit number. – FrobberOfBits Sep 18 '14 at 13:46
Good point. Thanks for that FrobberOfBits. I edited my post accordingly. – Roman Sep 18 '14 at 13:48

FrobberOfBits · Answer 2 · 2014-09-18T13:45:28.997

12

You've got it almost right; re.search only returns the first match, though, so the pattern itself has to cover both numbers.

match = re.search(r"(\d+).*?(\d+)", getexceeds)
firstNumber = match.group(1)
secondNumber = match.group(2)

Notice that the regex contains two capturing groups (in parentheses), each matching a sequence of digits. What's between them can be anything: .*? is a non-greedy match, meaning the smallest possible run of any characters.

Here's a little test I ran from the shell:

>>> import re
>>> str = 'conform: 244216 (packets) exceed: 267093 (packets)'
>>> match = re.search(r"(\d+).*?(\d+)", str)
>>> print(match.group(1))
244216
>>> print(match.group(2))
267093
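
If the position of the number matters less than its label, the pattern can be anchored on the label instead. This is only a sketch of that variation: it assumes the wanted value always directly follows the literal text "exceed:" as in the sample line, and the names EXCEED_RE and exceed_count are made up for illustration.

```python
import re

# Assumption: the wanted value always follows the literal label "exceed:".
EXCEED_RE = re.compile(r"exceed:\s*(\d+)")

def exceed_count(line):
    """Return the number after 'exceed:' as an int, or None if the label is absent."""
    match = EXCEED_RE.search(line)
    return int(match.group(1)) if match else None

line = "conform: 244216 (packets) exceed: 267093 (packets)"
print(exceed_count(line))  # 267093
```

Matching on the label keeps working if another number is later added earlier in the line, at the cost of depending on the exact label text.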

edited Sep 18 '14 at 13:45

answered Sep 18 '14 at 13:32

One suggestion: do NOT name a variable `str` (or any other built-in type name, like `list`), because it shadows the built-in `str` type for the rest of that scope, so later calls such as `str(x)` will fail. – SilentCloud Apr 06 '21 at 07:51
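
A small sketch of the problem that comment describes. The function names shadowed and unshadowed are hypothetical and exist only for this illustration.

```python
def shadowed(count):
    str = "exceed"  # local name now hides the built-in str type
    try:
        str(count)  # attempts to call the string "exceed"
    except TypeError:
        return True  # calling a plain string raises TypeError
    return False

def unshadowed(count):
    label = "exceed"  # a different name leaves the built-in untouched
    return label + ": " + str(count)

print(shadowed(267093))    # True
print(unshadowed(267093))  # exceed: 267093
```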
