How can I access the entire matched string in Python Parsley?

Question

I'm using OMeta and Python Parsley (http://parsley.readthedocs.org/) for parsing. Is there a way to access the string matched by a specific rule?

For example, consider this code:

In [1]: import parsley
In [2]: g = parsley.makeGrammar('addr = <digit+>:num ws <anything+>:street -> {"num": num, "street": street}', {})
In [3]: g('15032 La Cuarta St.').addr()
Out[3]: {'num': '15032', 'street': 'La Cuarta St.'}

I'm looking for a way to refer to the entire match, so that I can return something like:

{'match': '15032 La Cuarta St.', 'num': '15032', 'street': 'La Cuarta St.'}

The following code works for this purpose:

g = parsley.makeGrammar('addr = <<digit+>:num ws <anything+>:street>:match -> {"num": num, "street": street, "match": match}', {})

However, I have hundreds of rules and I'd like to avoid wrapping each one.

Not a criticism, just to understand your question better: why are you using Python Parsley instead of regular expressions? — William Denman, Nov 25 '13 at 09:47
Regexps are much harder to construct and maintain for complicated structures. (My actual use case is parsing textual representations of genomic variants.) Please look at the Parsley docs for examples. — Reece, Nov 25 '13 at 09:52
Understood. Parsley looks like a great module and I have had fun learning about it. Looks like there is no simple alternative to the method you suggest. Is your only acceptable solution not to have to modify your grammar rules in absolutely any way? — William Denman, Nov 25 '13 at 12:02

score 0 · Answer 1 · answered Nov 25 '13 at 13:12

Here's how I would do it if I didn't want to modify many rules by hand. It uses your solution but does not require editing the rules manually. You could also monkey patch this into Parsley (see the source at https://github.com/python-parsley/parsley/blob/master/parsley.py); then you wouldn't even have to add a call to myMakeGrammar.

import parsley
import re

# Split a rule of the form "name = body -> {dict entries}" into three groups.
# \s* after "->" tolerates extra spaces before the opening brace.
prog = re.compile(r'([a-z]* =) (.*) ->\s*\{(.*)\}')

def myMakeGrammar(grammar):
    # Rewrite the rule passed in (not a hard-coded string): wrap the body in
    # <...>:match and add "match" to the returned dict.
    gs = prog.search(grammar)
    new_grammar = gs.group(1) + ' <' + gs.group(2) + '>:match -> {' + gs.group(3) + ', "match": match}'
    return new_grammar

g = parsley.makeGrammar(myMakeGrammar('addr = <digit+>:num ws <anything+>:street ->    {"num": num, "street": street}'), {})
print(g('15032 La Cuarta St.').addr())

This returns:

{'num': '15032', 'street': 'La Cuarta St.', 'match': '15032 La Cuarta St.'}
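
Since you mention having hundreds of rules, the same wrapping could be applied line by line to a whole grammar string. The following is only a minimal sketch: wrap_rules is a hypothetical helper name, and it assumes each rule sits on a single line, ends in a non-empty dict literal after "->", and that any rule not shaped like that should be left alone. It only rewrites the grammar text; you would pass the result to parsley.makeGrammar as before.

```python
import re

# Assumption: one rule per line, shaped like  name = body -> {dict entries}
RULE = re.compile(r'^(\s*\w+\s*=)\s*(.*?)\s*->\s*\{(.*)\}\s*$')

def wrap_rules(grammar):
    """Return the grammar text with each matching rule body wrapped in <...>:match."""
    out = []
    for line in grammar.splitlines():
        m = RULE.match(line)
        if m is None:
            out.append(line)  # rules of any other shape are left untouched
            continue
        head, body, entries = m.groups()
        out.append('%s <%s>:match -> {%s, "match": match}' % (head, body, entries))
    return '\n'.join(out)

grammar = '''addr = <digit+>:num ws <anything+>:street -> {"num": num, "street": street}
zip = <digit+>:code ->   {"code": code}
word = <letter+>'''

wrapped = wrap_rules(grammar)
print(wrapped)
```

A real grammar with multi-line rules, alternatives that each have their own action, or actions that are not dict literals would need more than a regex, so treat this as a starting point rather than a general solution.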
