related question: Case insensitive replace

What's the best way to do a case-insensitive replace WITHOUT HURTING THE CACHE in the re module? I'm carefully monitoring the cache to make sure my favorite regexes stay there (for speed, of course).

I just noticed that my code:

ner_token_result = re.sub('(?i)'+leftover, corrected_word, ner_token_result)

compiles a new regex every time it runs, because leftover is dynamic (based on user input).

I like regular expressions (fast, I can read them) but I don't want to hurt my cache.

I don't want to use a caseless string class...

I don't want the ugliness of converting to lowercase, replacing and restoring case...

Please help?

Tal Weiss
  • What is *hurting the cache* supposed to mean? Why is this about case-insensitive replace and not any other kind of matching operation? – SilentGhost Jul 07 '10 at 09:54
  • All my other regular expressions are pre-compiled (and stored in the re cache). I can't do that when I want to do a caseless replace since the input is dynamic. I'll accept a good answer that does not use regular expressions for this problem, or one that uses regular expressions but without adding another compiled regular expression to the re cache. My application has continuous inputs and they will flush out my other regular expressions from the cache. – Tal Weiss Jul 07 '10 at 09:59

2 Answers

If your other expressions are pre-compiled it means you did something like this:

regex = re.compile(leftover, re.I)

That means you can keep using regex no matter what happens to the cache, because you hold your own reference to the compiled pattern object. If you didn't do this, do it for the regexes that need to be reused throughout your code.
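A minimal sketch of that idea, assuming two made-up fixed patterns (WORD_RE and SPACES_RE are hypothetical names, not from the question). It clears the module cache with re.purge() to suggest that compiled objects you hold yourself keep working regardless:

```python
import re

# Hypothetical fixed patterns, compiled once and kept alive by module-level names.
WORD_RE = re.compile(r"\w+")
SPACES_RE = re.compile(r"\s+")


def normalize_spaces(text):
    """Collapse each run of whitespace into a single space."""
    return SPACES_RE.sub(" ", text)


# Empty the re module's internal cache; the objects above do not depend on it.
re.purge()
result = normalize_spaces("a   b\t c")
words = WORD_RE.findall(result)
```

Calling methods on the compiled objects skips the cache lookup entirely, so it should not matter how many dynamic patterns pass through the cache in the meantime.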

SilentGhost

Obviously the dynamic regex needs to be compiled each time leftover changes. Are you worried that this is pushing your other regexes out of the cache?

If so, simply compile the other regexes you are using with re.compile and keep references to the compiled objects.
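For the dynamic part, something like the following might do. It is only a sketch: replace_caseless is a hypothetical helper name, and it assumes leftover is meant as literal text rather than as a regex, which the question does not state.

```python
import re


def replace_caseless(text, leftover, corrected_word):
    """Replace every case-insensitive occurrence of leftover in text."""
    # Assumption: leftover is literal user text, so escape regex metacharacters.
    pattern = re.compile(re.escape(leftover), re.IGNORECASE)
    # A function replacement keeps any backslashes in corrected_word literal.
    return pattern.sub(lambda match: corrected_word, text)
```

This pattern is still compiled whenever leftover changes, but that only matters to the other regexes if they rely on the cache instead of being held in variables.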

John La Rooy