I am trying to extract several fields from a log file. I am having trouble with lines that mix IPv4 addresses, subnets and variables. So far I can only match one kind of field (either an IP or a string), not both:
import re
regex = re.search(
r'.*(?P<destination_address>\b((?:\d+\.){3}\d+(?:/\d+)?)|\w+)\b(?P<destination_port>\d+)?\b(?P<destination_options>\w)?(?=via|\Z|//)',
"Myfirewall add 50750 set Mycounter allow udp from any to 123.45.67.89/28 123 via someotheriface"
)
regex2 = re.search(
r'.*(?P<destination_address>\b((?:\d+\.){3}\d+(?:/\d+)?)|\w+)\b(?P<destination_port>\d+)?\b(?P<destination_options>\w)?(?=via|\Z|//)',
"Myfirewall add 50750 set Mycounter allow udp from 123.45.67.89/28 to Mynic opt1 opt2,opt3 via someotheriface"
)
In both cases, the named groups do not contain what I expect. I would expect regex.group("destination_port") == "123" and regex2.group("destination_options") == "opt1 opt2,opt3".
What I can currently extract: all required fields up to the keyword "to" (not shown here, let me know if it is relevant). What I am still struggling with:
- capturing the string between "to" and either "via", a comment start (//) or the end of the line
- deciding whether its first token (the main part) is a constant (IPv4 address or subnet) or a variable (string)
- separating the main part from the secondary parts (port or options)
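
To make the three points above concrete, here is a rough sketch of what I have in mind, not a tested solution. The names DEST_RE and extract_destination are made up for illustration. It assumes that the destination always follows the first standalone "to", that a port is a bare integer, and that anything else before "via" or "//" counts as options.

```python
import ipaddress
import re

# Hypothetical pattern: address first, then an optional port, then optional options.
DEST_RE = re.compile(
    r"\bto\s+"
    r"(?P<destination_address>(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?|\w+)"
    r"(?:\s+(?P<destination_port>\d+)(?=\s|$))?"
    # The negative lookahead stops "via" or "//" from being taken as options.
    r"(?:\s+(?!via\b|//)(?P<destination_options>.+?))?"
    r"\s*(?=\bvia\b|//|$)"
)


def extract_destination(line):
    """Return the destination fields of one rule line, or None if absent."""
    match = DEST_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    try:
        # strict=False accepts host bits, e.g. 123.45.67.89/28.
        ipaddress.IPv4Network(fields["destination_address"], strict=False)
        fields["destination_kind"] = "ipv4"
    except ValueError:
        fields["destination_kind"] = "variable"
    return fields


line1 = "Myfirewall add 50750 set Mycounter allow udp from any to 123.45.67.89/28 123 via someotheriface"
line2 = "Myfirewall add 50750 set Mycounter allow udp from 123.45.67.89/28 to Mynic opt1 opt2,opt3 via someotheriface"
print(extract_destination(line1))
print(extract_destination(line2))
```

Using ipaddress for the constant-versus-variable decision keeps the regex itself loose, so an out-of-range value such as 999.1.1.1 would be classified as a variable rather than an address.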
If a regex for this task is too complicated, I am open to alternative solutions. I have used several other questions to build my regex so far:
Python regex to match IP-address with /CIDR
Python regex capture whole integer
(Python) Regex to extract network-object group from Cisco config
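
As an example of the kind of non-regex alternative I would accept, here is a minimal token-based sketch. The function name split_destination is hypothetical, and it assumes whitespace-separated tokens with "to" and "via" appearing as standalone words.

```python
import ipaddress


def split_destination(line):
    """Token-based sketch: return the destination fields, or None if absent."""
    # Drop a trailing // comment, then split on whitespace.
    tokens = line.split("//", 1)[0].split()
    if "to" not in tokens:
        return None
    # Assumption: the destination starts right after the first "to" token.
    start = tokens.index("to") + 1
    # It ends at "via" if present, otherwise at the end of the line.
    end = tokens.index("via", start) if "via" in tokens[start:] else len(tokens)
    dest = tokens[start:end]
    if not dest:
        return None
    address, rest = dest[0], dest[1:]
    try:
        ipaddress.IPv4Network(address, strict=False)
        kind = "ipv4"
    except ValueError:
        kind = "variable"
    # Assumption: a bare integer directly after the address is the port.
    port = rest.pop(0) if rest and rest[0].isdigit() else None
    return {
        "destination_address": address,
        "destination_kind": kind,
        "destination_port": port,
        "destination_options": " ".join(rest) or None,
    }


print(split_destination(
    "Myfirewall add 50750 set Mycounter allow udp from any to 123.45.67.89/28 123 via someotheriface"
))
```

This trades the single pattern for a few explicit steps, which I would find easier to debug when a line does not parse as expected.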