When I run the four lines of code below, I don't get the same result from all four. Why does the last line not find a match?

grep("CPA's", c("CPA's"))
agrep("CPA's", c("CPA's"))

grep("CPA'?s?", c("CPA's"))
agrep("CPA'?s?", c("CPA's"))

I haven't yet done my full reading on the fuzzy matching functions, but on the face of it I don't see why this would be an issue.

oguz ismail
BeerSharkBot

1 Answer

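Because agrep() defaults to fixed = TRUE, the pattern is taken as a literal string, so the two "?" characters count as differences from "CPA's". That difference is more than the default max.distance of 0.1 allows. Increase max.distance and it will match.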

agrep("CPA'?s?", "CPA's", max.distance = 0.15)
#[1] 1
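
To see why 0.15 is enough while the default 0.1 is not, here is a rough sketch of the arithmetic. It assumes the behaviour described in ?agrep, where a fractional max.distance is scaled by the pattern length and rounded up, with unit costs for every edit. The variable names are only for illustration.

```r
pattern <- "CPA'?s?"
x <- "CPA's"

# With fixed = TRUE (the default) the pattern is a literal 7-character string.
pattern_length <- nchar(pattern)

# Assumed rule from ?agrep: a fraction is multiplied by the pattern length
# and replaced by the smallest integer not less than the result.
allowed_default <- ceiling(0.10 * pattern_length)  # 0.70 rounds up to 1
allowed_wider   <- ceiling(0.15 * pattern_length)  # 1.05 rounds up to 2

# Generalized Levenshtein distance between the two full strings:
# the two literal "?" characters have to be deleted.
edits_needed <- drop(adist(pattern, x))

default_hits <- agrep(pattern, x)                       # 2 edits needed, 1 allowed
wider_hits   <- agrep(pattern, x, max.distance = 0.15)  # 2 edits needed, 2 allowed
```

So the literal pattern is two edits away from "CPA's", the default only tolerates one, and 0.15 is the point where the allowance reaches two.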

To have the pattern treated as a regular expression, set fixed = FALSE; it then matches directly, without changing max.distance.

agrep("CPA'?s?", "CPA's", fixed = FALSE)
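
As a small side-by-side sketch (variable names are illustrative), the same pattern can be run through both functions to show that only the default, literal agrep() call differs:

```r
x <- c("CPA's")
pattern <- "CPA'?s?"

# grep() defaults to fixed = FALSE, so "?" acts as a regex quantifier.
grep_hits <- grep(pattern, x)

# agrep() defaults to fixed = TRUE, so "?" is matched as a literal character.
literal_hits <- agrep(pattern, x)

# With fixed = FALSE, agrep() also reads the pattern as a regular expression.
regex_hits <- agrep(pattern, x, fixed = FALSE)

print(grep_hits)
print(literal_hits)
print(regex_hits)
```

Under these assumptions the first and third calls should agree, while the second returns no match.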
Ronak Shah
  • I'm an idiot. I didn't realize the default for the fixed argument to agrep was true. I thought agrep would take that as a regular expression and treat the two strings then as matches with no difference. – BeerSharkBot Dec 10 '19 at 01:25
  • yes, you can also use `agrep("CPA'?s?", c("CPA's"), fixed = FALSE)` in which case it would be treated as regular expression. – Ronak Shah Dec 10 '19 at 01:26