7

I'm searching for some widely extended dialect (like this one https://github.com/vmeurisse/wildmatch + globstar **) described with BFN rules.

In any format or language. OMeta or PEG would be great.

Ev_genus
  • 135
  • 11

1 Answers1

2

I'm not sure to understand your question since the grammar for file path wildcard can be reduced to a simple regular expression. This grammar is defined by the Unix Shell.

You can find the BNF for Bash here: http://my.safaribooksonline.com/book/operating-systems-and-server-administration/unix/1565923472/syntax/lbs.appd.div.3

In Python programming language, a definition of the glob.glob() function is available in the documentation. This function use the fnmatch.fnmatch() function to perform the pattern matching. The documentation is available here: https://docs.python.org/2/library/fnmatch.html#fnmatch.fnmatch.

The fnmatch.fnmatch function translate a file path wildcard pattern to a classic regular expression, like this:

def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i = i+1
        if c == '*':
            res = res + '.*'
        elif c == '?':
            res = res + '.'
        elif c == '[':
            j = i
            if j < n and pat[j] == '!':
                j = j+1
            if j < n and pat[j] == ']':
                j = j+1
            while j < n and pat[j] != ']':
                j = j+1
            if j >= n:
                res = res + '\\['
            else:
                stuff = pat[i:j].replace('\\','\\\\')
                i = j+1
                if stuff[0] == '!':
                    stuff = '^' + stuff[1:]
                elif stuff[0] == '^':
                    stuff = '\\' + stuff
                res = '%s[%s]' % (res, stuff)
        else:
            res = res + re.escape(c)
    return res + '\Z(?ms)'

That can help you to write de BNF grammar...

EDIT

Here is a very simple grammar:

wildcard : expr
         | expr wildcard

expr : WORD
     | ASTERIX
     | QUESTION
     | neg_bracket_expr
     | pos_bracket_expr

pos_bracket_expr : LBRACKET WORD RBRACKET

neg_bracket_expr : LBRACKET EXCLAMATION WORD RBRACKET

A list of popular grammars parsed by the famous ANTLR tool is available here: http://www.antlr3.org/grammar/list.html.

Laurent LAPORTE
  • 21,958
  • 6
  • 58
  • 103
  • `the grammar for file path wildcard can be reduced to a simple regular expression` well, actually yes. There is the way to write regular expression to transform pattern into another regular expression that can match path. But this solution lack error handling in pattern. Also I need rich grammar implementation to make my own dialect. – Ev_genus Sep 19 '14 at 22:12