Python PLY Lex ambiguity

Question

I have a problem with ambiguity at the token level.

My code looks like this; t_UN1 is defined first, so it has higher precedence:

t_ignore = ' \t\v\r' # whitespace 

....

def t_UN1(t): #NS_
    r'NS\_'
    return t
def t_IDENTIFIER(t):
    r'[a-zA-Z][a-zA-Z0-9_]*'
    return t

....

I would like a string such as NS_XYZ to be identified as "IDENTIFIER", and a single NS_ surrounded by whitespace to be identified as "UN1".

How should I handle that? Currently the string NS_XYZ is simply split into two tokens, UN1 and IDENTIFIER.
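
The split can be reproduced without PLY. PLY adds function-defined rules to its master regular expression in the order they are defined, and Python's `re` takes the first alternative that matches rather than the longest one. The snippet below is only a sketch of that behaviour using plain `re`; the variable names are illustrative and not part of PLY.

```python
import re

# Alternation in definition order, roughly how PLY combines the two rules.
master = re.compile(r"(?P<UN1>NS_)|(?P<IDENTIFIER>[a-zA-Z][a-zA-Z0-9_]*)")

text = "NS_XYZ"

# First match: UN1 wins because it is the first alternative that matches,
# even though IDENTIFIER could have matched the longer string "NS_XYZ".
m1 = master.match(text)
first = (m1.lastgroup, m1.group())

# Second match resumes where the first one stopped.
m2 = master.match(text, m1.end())
second = (m2.lastgroup, m2.group())

print(first, second)
```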

score 1 · Accepted Answer · answered Mar 11 '13 at 15:16

If you're looking to get 'single NS_ surrounded by white spaces', you can add the white space character class into your token string:

def t_UN1(t): #NS_
    r'\s+NS\_\s+'
    return t
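
A possible alternative, not part of the original answer: a negative lookahead lets UN1 match only when no identifier character follows, without consuming any whitespace. This may matter because, as far as I can tell, PLY skips characters listed in t_ignore before trying any rule, so a leading `\s+` might never see the space in front of NS_; a pattern that does match surrounding whitespace would also include it in the token value. The sketch below emulates the lexer loop with plain `re`; `RULES`, `MASTER`, `IGNORE` and `tokenize` are hypothetical names used for illustration only.

```python
import re

# Same two rules, but UN1 now refuses to match when an identifier
# character follows it (negative lookahead).
RULES = [
    ("UN1", r"NS_(?![a-zA-Z0-9_])"),
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9_]*"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))
IGNORE = " \t\v\r"  # mirrors t_ignore from the question


def tokenize(text):
    """Yield (type, value) pairs; a stand-in for the PLY lexer loop."""
    pos = 0
    while pos < len(text):
        if text[pos] in IGNORE:
            pos += 1  # skip ignored characters before matching
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"illegal character {text[pos]!r} at {pos}")
        yield match.lastgroup, match.group()
        pos = match.end()


tokens = list(tokenize("NS_XYZ NS_ foo"))
print(tokens)
```

In a PLY lexer the same idea would amount to changing the docstring of t_UN1 to r'NS_(?![a-zA-Z0-9_])' and leaving t_IDENTIFIER as it is; PLY rules are ordinary Python regular expressions, so lookahead is available.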

Side note: the ply-hack Google group is a good place to ask PLY-related questions.

MichaelJCox

ply-hack tells me I don't have permission to post *despite having joined* the group. I have tried both the web interface and email. – Honza Jul 29 '13 at 13:03
