Capturing 0-n occurrences of a regex group

Question

I'm parsing output from text files whose contents can change depending on when the script is run. I want to capture one or more lines, each holding an IP address, an advertising router address, and an age. When I run my script, it captures only the first or the last matching line, depending on where I place the quantifiers, but I want every IP line captured. The number of IP lines varies with the time of day; the rest of the file stays the same.

Text

When I run this code, the "link" capture group captures only the first or last IP line after each "Link States" header and its column headers. The "link_state" group is captured correctly and everything else works, but not all of the IP lines are captured. Any suggestions for capturing them all as a group? (I've included my IDLE output at the end.)

import re

temp = []
infile = ["ospf_r4.txt", "ospf_r1.txt"]


regex = (
    r"(?P<link_state>[\S+ ]* +Link States) ?(?:\(Area \S+\))?\n"
    r"\n*"
    r"(?P<link_col>Link ID) +(?P<adv_col>ADV Router) +(?P<age_col>Age) +Seq# +(?:[\S+ ]?)+"
    r"\n*"
    r"((?P<link>[\d+\.]+ +[\d+\.]+ +[\d]+)+ +[\S+ *]*[\n]*)+"
)


def temp_function(infile):
    global temp
    temp = []
    with open(infile, "r") as x:
        c = x.read()
        result = re.finditer(regex, c)
        for i in result:
            #temp.append(i.group())
            #print(i.group("link"))
            print(i)
        return temp

Here is my output from IDLE. Note that instead of 8 IPs it gives me 5. Switching between finditer and findall has not helped. Any suggestions?

4.4.4.4 4.4.4.4 1993

172.16.14.2 4.4.4.4 1993

192.168.2.0 1.1.1.1 1031

3.3.3.3 1.1.1.1 1031

10.0.0.0 3.3.3.3 977
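
A note on why only one row survives per match: in Python's `re` module, a capture group that sits inside a repeated group is overwritten on each iteration, so after the match it holds only the text from the last iteration. A minimal sketch with made-up input (not the OSPF data) that shows the behaviour:

```python
import re

# The inner group (\d+) is repeated three times by the outer (?: ... )+.
m = re.match(r"(?:(\d+),)+", "1,2,3,")

print(m.group(0))  # the whole match
print(m.group(1))  # only the text from the last iteration

# findall on the unit pattern returns every occurrence instead.
print(re.findall(r"(\d+),", "1,2,3,"))
```

Here `m.group(1)` is `'3'`, while `findall` returns `['1', '2', '3']`. The same thing happens to the "link" group when it is wrapped in `( ... )+`.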

[Please post textual data directly instead of using screenshots.](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question) — CrazyChucky, Nov 01 '22 at 02:26
Won't findall with this simpler regex do the job ? myreg = r'[0-9]+.[0-9]+.[0-9]+.[0-9]+[^\w]+[0-9]+.[0-9]+.[0-9]+.[0-9]+[^\w]+[0-9]+' — Swifty, Nov 01 '22 at 15:42
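
A sketch of the idea in the comment above, with two adjustments that are my own assumptions rather than part of the comment: the dots are escaped (an unescaped `.` matches any character), and the separators are plain spaces so a match cannot run across a line break. The sample rows reuse addresses from the question; the Seq# and checksum values are made up.

```python
import re

# Hypothetical sample rows; the Seq# and checksum columns are invented.
sample = (
    "Link ID         ADV Router      Age   Seq#       Checksum\n"
    "4.4.4.4         4.4.4.4         1993  0x80000001 0x00AAAA\n"
    "172.16.14.2     4.4.4.4         1993  0x80000001 0x00BBBB\n"
)

# Dotted quad with escaped dots, then the full row: link ID, ADV router, age.
IP = r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
ROW = re.compile(rf"({IP}) +({IP}) +([0-9]+)")

# With three capture groups, findall returns one tuple per matching row.
rows = ROW.findall(sample)
for link_id, adv_router, age in rows:
    print(link_id, adv_router, age)
```

This collects every row in the file, but it does not record which "Link States" section each row belongs to.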

score 0 · Answer 1 · answered Nov 01 '22 at 02:05

Well, I found one way of doing this:

regex = (
    r"[\S+\- ]* +Link States ?[\(Area \S+\)]*\n"
    r"\n*"
    r"(Link ID +ADV Router +Age) +Seq# +[\S+ ]*"
    r"\n*"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)?) *\S+ *\S+ *\S*\n*"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
    r"(?:([\d+\.]+ +[\d+\.]+ +[\d+]+)? +\S+ +\S+ *[\d+]*\n)?"
)

I am wondering whether there is a way to collapse the repeated lines into a single pattern that matches one or more rows. I'm trying to avoid getting a "None" group when a row doesn't exist.
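
One possible way to avoid the copied lines is a two-stage parse: match each section once, capturing all of its rows as a single block of text, then run `findall` on that block. The sketch below is not tested against the real files. It assumes each section is a "... Link States" line, optional blank lines, a "Link ID  ADV Router  Age ..." header, and then consecutive data rows. The names `SECTION`, `ROW` and `parse_sections` are hypothetical, and the Seq#, checksum and link-count values in the sample are made up.

```python
import re

# Stage 1: one match per section; all data rows land in the "rows" group.
SECTION = re.compile(
    r"^[ \t]*(?P<link_state>\S[^\n]*?Link States)[^\n]*\n"
    r"\n*"
    r"[ \t]*Link ID +ADV Router +Age[^\n]*\n"
    r"(?P<rows>(?:[ \t]*\d+(?:\.\d+){3} +\d+(?:\.\d+){3} +\d+[^\n]*(?:\n|$))*)",
    re.MULTILINE,
)

# Stage 2: one tuple (link ID, ADV router, age) per row inside a section.
ROW = re.compile(
    r"^[ \t]*(\d+(?:\.\d+){3}) +(\d+(?:\.\d+){3}) +(\d+)",
    re.MULTILINE,
)


def parse_sections(text):
    """Return a list of (section name, list of row tuples)."""
    return [
        (m.group("link_state"), ROW.findall(m.group("rows")))
        for m in SECTION.finditer(text)
    ]


# Hypothetical input resembling the files in the question.
sample = """\
                Router Link States (Area 0)

Link ID         ADV Router      Age         Seq#       Checksum Link count
4.4.4.4         4.4.4.4         1993        0x80000001 0x00AAAA 2
3.3.3.3         1.1.1.1         1031        0x80000001 0x00BBBB 1

                Net Link States (Area 0)

Link ID         ADV Router      Age         Seq#       Checksum
172.16.14.2     4.4.4.4         1993        0x80000001 0x00CCCC
"""

for name, rows in parse_sections(sample):
    print(name, rows)
```

Because the rows are repeated with `*` inside a single group, a section with no data rows gives an empty list rather than a `None` group, and the number of rows no longer has to be known in advance.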

As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). — Community, Nov 03 '22 at 14:00
