I am trying to figure out how to use this nifty lib (pyparsing) to parse BigIP config files. The grammar should be something like this:

stanza :: name { content }
name   :: several words, might contain alphas nums dot dash underscore or slash
content:: stanza OR ZeroOrMore(printable characters)

To make things slightly more complicated, one exception:

If name starts with "rule ", then content cannot be "stanza"

I started with this:

from pyparsing import *
def parse(config):
    def BNF():
        """
        Example:
        ...
        ltm virtual /Common/vdi.uis.test.com_80_vs {
            destination /Common/1.2.3.4:80
            http-class {
                /Common/http2https
            }
            ip-protocol tcp
            mask 255.255.255.255
            profiles {
                /Common/http { }
                /Common/tcp { }
            }
            vlans-disabled
        }
        ...
        """        
        lcb, rcb, slash, dot, underscore, dash = [c for c in '{}/._-']
        name_word = Word(alphas + nums + dot + underscore + slash + dash)
        name = OneOrMore(name_word).setResultsName("name")
        stanza = Forward()
        content = OneOrMore(stanza | ZeroOrMore(OneOrMore(Word(printables)))).setResultsName("content")
        stanza << Group(name + lcb + content + rcb).setResultsName("stanza")
        return stanza


    return [x for x in BNF().scanString(config)]

The code above seems to get stuck in an infinite loop. It also does not handle my requirement that content must not be parsed as a "stanza" when "name" starts with "rule ".

lrhazi

1 Answer

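OneOrMore(ZeroOrMore(OneOrMore(Word(printables)))) will always match, because the inner ZeroOrMore succeeds even when it consumes no input. The outer OneOrMore therefore keeps repeating without ever advancing, which leads to the infinite loop.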
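To see why the inner term is the problem, here is a small sketch (the variable names are just for illustration) showing that the ZeroOrMore part happily "matches" an empty string, while OneOrMore on its own does not. An expression that can succeed without consuming anything is what makes wrapping it in another repetition risky.

```python
from pyparsing import OneOrMore, ParseException, Word, ZeroOrMore, printables

# The inner term from the question, without the outer OneOrMore.
inner = ZeroOrMore(OneOrMore(Word(printables)))

# ZeroOrMore is satisfied by zero repetitions, so parsing an empty
# string succeeds and produces no tokens.
empty_tokens = inner.parseString("").asList()

# OneOrMore by itself needs at least one word, so empty input is rejected.
try:
    OneOrMore(Word(printables)).parseString("")
    one_or_more_accepts_empty = True
except ParseException:
    one_or_more_accepts_empty = False
```

Because the inner expression can succeed at a position without moving forward, repeating it gives the parser no reason to stop.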

Also, printables includes the closing curly bracket, which gets consumed by the content term and is then no longer available to close the stanza. (If your content can include a closing bracket, you need to define some way to escape it, to distinguish a content bracket from a stanza bracket.)
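A quick way to check the bracket problem, as a sketch with a made-up sample string: compare a word built from all of printables with one that uses Word's excludeChars argument to leave out the closing bracket.

```python
from pyparsing import OneOrMore, Word, printables

# Word(printables) accepts '}' like any other printable character.
greedy = OneOrMore(Word(printables))
# excludeChars removes '}' from the allowed set, so matching stops before it.
bounded = OneOrMore(Word(printables, excludeChars="}"))

sample = "ip-protocol tcp }"  # hypothetical tail of a stanza body
greedy_tokens = greedy.parseString(sample).asList()
bounded_tokens = bounded.parseString(sample).asList()
```

With the greedy version the bracket ends up inside the content tokens; with the bounded version it is left in the input for the enclosing stanza to match.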

To address the name rule, you need another content definition, one that doesn't include stanza, and a "rule rule".

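from pyparsing import *

def parse(config):
    def BNF():
        lcb, rcb, slash, dot, underscore, dash = [c for c in '{}/._-']
        # Rule bodies are plain text: any printable word except the closing bracket.
        printables_no_rcb = Word(printables, excludeChars=rcb)
        # General content must also stop at '{' so that nested stanzas stay visible.
        printables_no_braces = Word(printables, excludeChars=lcb + rcb)
        name_word = Word(alphas + nums + dot + underscore + slash + dash)
        name = OneOrMore(name_word).setResultsName("name")
        # The "rule rule": a name that starts with the word 'rule'.
        rule = Group(Literal('rule') + name).setResultsName("name")
        rule_content = OneOrMore(printables_no_rcb).setResultsName("content")
        stanza = Forward()
        # ZeroOrMore allows empty stanzas such as "/Common/http { }".
        content = ZeroOrMore(stanza | printables_no_braces).setResultsName("content")
        stanza << Group(rule + lcb + rule_content + rcb | name + lcb + content + rcb).setResultsName("stanza")
        return stanza
    return [x for x in BNF().scanString(config)]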
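
One thing to watch with two alternatives: if the rule alternative fails, the general alternative can still pick up a name that starts with "rule". A possible way around that, sketched below under my own assumptions, is a negative lookahead (~Keyword("rule")) on the ordinary name. The names LCB, RCB, text_word, rule_name, plain_name and parses are made up for this example, brackets are suppressed from the output, and rule bodies that themselves contain brackets are not handled here.

```python
from pyparsing import (Forward, Group, Keyword, OneOrMore, ParseException,
                       Suppress, Word, ZeroOrMore, alphas, nums, printables)

LCB, RCB = Suppress("{"), Suppress("}")
name_word = Word(alphas + nums + "/._-")
text_word = Word(printables, excludeChars="{}")  # plain content token
RULE = Keyword("rule")

rule_name = Group(RULE + OneOrMore(name_word))
# ~RULE is a negative lookahead: an ordinary name may not begin with 'rule'.
plain_name = Group(~RULE + OneOrMore(name_word))

stanza = Forward()
rule_body = Group(ZeroOrMore(text_word))            # never recurses into stanza
plain_body = Group(ZeroOrMore(stanza | text_word))  # may hold nested stanzas
stanza <<= Group(rule_name + LCB + rule_body + RCB
                 | plain_name + LCB + plain_body + RCB)


def parses(text):
    """Return True if text is one complete stanza under this sketch."""
    try:
        stanza.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


nested = stanza.parseString("profiles { /Common/http { } /Common/tcp { } }").asList()
rule_tokens = stanza.parseString("rule my_rule { when HTTP_REQUEST }").asList()
```

Here parses("rule r { a { b } }") is False, since a rule body is never allowed to contain a stanza, while the nested "profiles" example parses into one group per inner stanza.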
Daniel 'Dang' Griffith