How to use flags with Python's regular expressions in string notation?

Question

I need to use PLY for a parser, and PLY forces you to write regular expressions in string notation inside a token definition. For example (from the docs):

# A regular expression rule with some action code
def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

# Define a rule so we can track line numbers
def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

I was wondering how to write more complex regular expressions using this notation, specifically how to use the IGNORECASE flag for the RE.

You are writing the expression `r'\d+'`. This essentially does nothing. You are not assigning it to any variable or using it in any way. Please provide an example of the pattern of characters you are looking for. `\d` means something in regular expressions (`import re`); `r'\d+'` is different from `r'\d?'` and different from `r'\d*'`. – hussam, Dec 05 '20 at 21:11
That's an example from the docs (https://ply.readthedocs.io/en/latest/ply.html#parsing-basics). PLY works this way. I need to use the ignore case flag with a very simple pattern: the letter 'a' or 'A'. I want to use the flag specifically because I want to test it and use it later with more complex patterns. – Werner Germán Busch, Dec 05 '20 at 21:13
So for a or A you can do `r"[aA]"`, or you can add a quantifier: `r"[aA]+"`. `*` causes the resulting RE to match 0 or more repetitions of the preceding RE. `+` causes the resulting RE to match 1 or more repetitions of the preceding RE. `?` causes the resulting RE to match 0 or 1 repetitions of the preceding RE. – hussam, Dec 05 '20 at 21:17
So there is no way at all to use the ignore case flag? I want to use that flag specifically, I don't want a workaround. – Werner Germán Busch, Dec 05 '20 at 21:20

score 2 · Accepted Answer · answered Dec 05 '20 at 21:22

To enable the ignore case flag, you can use `(?i)` at the beginning of the regular expression. For example:

def t_id(t):
  r'(?i)[a-z_][a-z0-9_]+'
  return t
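The effect of the inline flag can be checked with the standard `re` module alone, without PLY. The sketch below is only an illustration: the names `PATTERN`, `inline`, `flagged` and `scoped` are made up for this example. It compares the inline `(?i)` flag, the `re.IGNORECASE` compile flag, and the scoped form `(?i:...)` available since Python 3.6. The scoped form is included because, from Python 3.11 on, a global inline flag such as `(?i)` is only accepted at the very start of the whole pattern, so it may be the safer choice whenever a pattern ends up embedded in a larger one.

```python
import re

# Hypothetical identifier pattern, mirroring the rule in the answer.
PATTERN = r'[a-z_][a-z0-9_]+'

# Inline global flag at the very start of the pattern.
inline = re.compile(r'(?i)' + PATTERN)

# The same behaviour requested through the compile-time flag.
flagged = re.compile(PATTERN, re.IGNORECASE)

# Scoped flag group (Python 3.6+): case is ignored only inside the group.
scoped = re.compile(r'(?i:' + PATTERN + r')')

sample = 'Foo_Bar1'
for name, rx in (('inline', inline), ('flagged', flagged), ('scoped', scoped)):
    print(name, bool(rx.fullmatch(sample)))

# Without any flag the uppercase letters make the match fail.
print('no flag', bool(re.fullmatch(PATTERN, sample)))
```

If the scoped form is preferred, the docstring of the rule would presumably become `r'(?i:[a-z_][a-z0-9_]+)'`. That variant is an assumption and was not tested against PLY in this thread.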

sepp2k

Do you have to use it like this, `r'^(?i)[a-z_][a-z0-9_]+'`, so that it anchors to the beginning of the statement? – hussam Dec 05 '20 at 21:24
Thank you, this is exactly what I wanted to know :) – Werner Germán Busch Dec 05 '20 at 21:25
@hussam You don't have to anchor regular expressions in PLY. – sepp2k Dec 05 '20 at 21:26
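A rough illustration of why an explicit `^` is not needed, assuming (the thread does not confirm this) that the lexer tries each rule at the current input position in the manner of `re.match`. The sketch uses only the standard `re` module; `ident` and `text` are hypothetical names for this example.

```python
import re

ident = re.compile(r'(?i)[a-z_][a-z0-9_]+')
text = '42 Foo'

# match() succeeds only if the pattern matches starting exactly at the
# given position, so it behaves as if anchored there.
at_start = ident.match(text, 0)   # '4' cannot start an identifier
at_three = ident.match(text, 3)   # position 3 is the 'F' of 'Foo'

# search(), by contrast, scans forward until it finds a match.
found = ident.search(text)

print(at_start)
print(at_three.group())
print(found.group())
```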
