I have a problem with fuzzy search in the Python regex module.

This is working:

import regex
s = '2991  Nixon Avenue Chattanooga Tennessee'
regex.search(r"(?msi)(?=.*\bnixon\b)(?=.*\bchattanooga\b)",s)

This is not working (I removed a t from Chattanooga); the result is None:

import regex
s = '2991  Nixon Avenue Chatanooga Tennessee'
regex.search(r"(?msie)(?=.*\bnixon\b)(?=.*\bchattanooga\b){e=<3}",s)

What am I doing wrong here?
It looks like it has something to do with the positive lookahead and the word boundaries.

Note: This is just a simple example to get it working. In reality it is part of a more complex job.

As an aside, do I need to specify the fuzziness per regex item (nixon, chattanooga), or is it possible to do it for both at the same time, e.g. ((?=.*\bnixon)(?=.*\bchattanooga\b)){e=<3}?

John Doe

1 Answer

I was applying the fuzziness to the lookahead itself instead of to its contents.

If it's "Chattanooga" that's fuzzy, do:

regex.search(r"(?msie)(?=.*\bnixon\b)(?=.*\b(?:chattanooga){e<=3}\b)",s)
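
For comparison, here is a minimal sketch that runs the exact pattern and the fuzzy pattern against the misspelled string from the question. It assumes the third-party regex module is installed; the variable names exact and fuzzy are only for illustration.

```python
import regex  # third-party module (pip install regex), not the standard-library re

s = '2991  Nixon Avenue Chatanooga Tennessee'  # "Chattanooga" is missing a t

# Exact lookaheads: the misspelled city name cannot satisfy the second one.
exact = regex.search(r"(?msi)(?=.*\bnixon\b)(?=.*\bchattanooga\b)", s)

# Fuzzy constraint attached to the word inside the lookahead,
# not to the lookahead group itself.
fuzzy = regex.search(r"(?msie)(?=.*\bnixon\b)(?=.*\b(?:chattanooga){e<=3}\b)", s)

print(exact)              # None
print(fuzzy is not None)  # True
```

Because the pattern consists only of lookaheads, a successful match is empty and only tells you that both words were found. If "nixon" should be fuzzy as well, wrapping it in its own (?:nixon){e<=...} group inside its lookahead should work the same way, with a limit chosen per word.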
John Doe