I'm trying to parse a Nagios / Icinga config so I can do further processing on it with Python. Since I could not find a working library to do that (pynag does not seem to work at all), I'm trying to write a simple Python script using regexes to do so.

Basically, I want to get from this config file (it uses tabs for indentation):

define host {
    address 123.123.123.123
    passive_checks_enabled  1
    }

define service {
    service_description Crondaemon
    check_command   check_nrpe_1arg!check_crondaemon
    }

to something like this Python tuple:

(
 ('host', ('address', '123.123.123.123'), ('passive_checks_enabled', '1')), 
 ('service', ('service_description', 'Crondaemon'), ('check_command', 'check_nrpe_1arg!check_crondaemon'))
)

This is my full script with parsing logic including an example to test:

import re

# white spaces are tabs!
TEST_STR = """
define host {
    address 123.123.123.123
    passive_checks_enabled  1
    }

define service {
    service_description Crondaemon
    check_command   check_nrpe_1arg!check_crondaemon
    }
"""

cfg_all_regex = re.compile(
    r'define\s+(\w+)\s*\{'
    '(.*?)'
    '\t}',
    re.DOTALL
)
# basic regex works
print(re.findall(cfg_all_regex, TEST_STR))

cfg_all_regex = re.compile(
    r'define\s+(\w+)\s*{\n'
    '(\t(.*)?\t(.*)?\n)*'
    '\t}',
    re.DOTALL
)
# more specific regex to extract all key values fails
print(re.findall(cfg_all_regex, TEST_STR))

Unfortunately, I cannot get the full parsing to work: the regex always matches either everything or nothing. Can you please give me a hint on how to fix it so I can extract all key/value pairs from my Icinga config?

Wolkenarchitekt
  • pynag is actually working, but the [page](https://pynag.org) you linked seems outdated and unmaintained. It links to the [GitHub repo](https://github.com/pynag/pynag), where you can get the latest release. – Ashark Nov 22 '21 at 12:56

1 Answer

The standard re module does not keep repeated captures, so in

'(\t(.*)?\t(.*)?\n)*'

each group retains only the text from its last iteration.
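A minimal sketch of that behaviour, using a made-up two-line sample (`sample` and its keys are hypothetical, not taken from the Icinga config): the repeated group matches both lines, but only the last key/value pair is still available afterwards.

```python
import re

# Hypothetical sample: each line is a tab, a key, a tab, a value.
sample = "\tkey1\tvalue1\n\tkey2\tvalue2\n"

# The outer group repeats, but each capture group remembers only its
# most recent iteration.
match = re.match(r'(?:\t(\w+)\t(\w+)\n)*', sample)
last_pair = match.groups()
print(last_pair)  # ('key2', 'value2')
```

The whole string is consumed by the match, yet the first pair is lost, which is why a single repeated group cannot return every key/value pair.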

Instead, I would rewrite that part so it matches a single key/value line:

r'\t(\w+)\s+([^\n]*)\n'

So a possible solution, given the structure of your data, is to create one regular expression that matches any of the three line types (the block header, a key/value line, or the closing brace):

regex = r'define\s+(\w+)\s+\{\n|\t(\w+)\s+([^\n]*)\n|\t\}'
matches = re.finditer(regex, TEST_STR, re.DOTALL)

With a for loop you can iterate over the matches and print every group that matched:

for match in matches:
    # Groups are numbered from 1; group 0 is the whole match.
    for group_num in range(1, len(match.groups()) + 1):
        if match.group(group_num):
            print("Group {}: {}".format(group_num, match.group(group_num)))

Output:

Group 1: host
Group 2: address
Group 3: 123.123.123.123
Group 2: passive_checks_enabled
Group 3: 1
Group 1: service
Group 2: service_description
Group 3: Crondaemon
Group 2: check_command
Group 3: check_nrpe_1arg!check_crondaemon
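To get from these groups to the nested tuple asked for in the question, one option is to track the currently open block while iterating. The following is only a sketch: `parse_config`, `TOKEN_REGEX` and `SAMPLE_CFG` are names made up for this illustration, and it assumes every block is closed by a tab followed by `}` as in the sample.

```python
import re

# Sample config; "\t" escapes stand in for the tab indentation.
SAMPLE_CFG = (
    "define host {\n"
    "\taddress\t123.123.123.123\n"
    "\tpassive_checks_enabled\t1\n"
    "\t}\n"
    "\n"
    "define service {\n"
    "\tservice_description\tCrondaemon\n"
    "\tcheck_command\tcheck_nrpe_1arg!check_crondaemon\n"
    "\t}\n"
)

# Same alternation as above: block header | key/value line | closing brace.
TOKEN_REGEX = re.compile(r'define\s+(\w+)\s+\{\n|\t(\w+)\s+([^\n]*)\n|\t\}')


def parse_config(text):
    """Return one ('type', (key, value), ...) tuple per define block."""
    blocks = []
    current = None
    for match in TOKEN_REGEX.finditer(text):
        object_type, key, value = match.groups()
        if object_type is not None:
            # "define <type> {" opens a new block.
            current = [object_type]
        elif key is not None and current is not None:
            # A key/value line inside the open block.
            current.append((key, value))
        elif current is not None:
            # No group matched, so this is the closing brace.
            blocks.append(tuple(current))
            current = None
    return tuple(blocks)


parsed = parse_config(SAMPLE_CFG)
print(parsed)
```

Keeping the state in `current` means key/value lines that appear outside any block are ignored rather than attached to the wrong object.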
alvarez
  • Wow did not expect that this cannot be solved with a simple regex. But your solution works like a charm and I could complete my parser. Final logic can be found here: https://gist.github.com/ifischer/6e8aa105c5f644fd3803f8b41dcbe4f3 Thank you so much for your help, saved me lots of time fiddling! – Wolkenarchitekt Aug 09 '17 at 14:32
  • Just found out that [regex](https://pypi.python.org/pypi/regex/) supports repeated captures. Maybe with that the solution could be simplified a lot. But not worth the effort for now. – Wolkenarchitekt Aug 09 '17 at 14:39
  • You're right. If installing third-party libraries is viable in your project, the regex module is a better option. – alvarez Aug 09 '17 at 16:53