1

I want to match any line that does not end with 'CA' or 'CA[any number]'. How can I do that using rlike in MySQL? (Note it doesn't support ?! etc).

Here's the regex for a positive match, I just need a way to negate it: '^.*[C][A][0-9]?$'

(Due to an embarrassing architecture limitation, I don't want to use not rlike ...)

Unihedron
  • 10,902
  • 13
  • 62
  • 72
Greg
  • 45,306
  • 89
  • 231
  • 297
  • "not rlike" is the way to due this, why exactly don't you want to do this? – Robert Gamble Dec 02 '08 at 18:20
  • His 'n' key is broken. It's very embarrassing. ;-) – Bill Karwin Dec 02 '08 at 18:26
  • Well the regex is coming from a column in a table, and most of the patterns there already work. So it would easiest if I could make this pattern work within the existing structure. – Greg Dec 02 '08 at 18:56
  • There's nothing wrong with wanting to avoid 'not rlike'. Ideally, you want one regex that matches exactly what you want and excludes what you don't. Unfortunately, MySQL did not choose to implement negation operator ?!, so you often can't reduce it to one simple expression. If I need an exclusion, I usually have to match against 2 regex columns, one that I RLIKE, and another that I NOT RLIKE against to make sure exclusions are honored. It's really dumb. – Kasey Speakman Dec 08 '11 at 23:34

2 Answers2

6

The trick is turning it into a description of what you do want to match. Something doesn't end in CA or CA(digit) when:

It ends with something other than A or digit, or

It ends with something other than A, then a digit, or

It ends with something other than C, then A, then a digit, or

It ends with something other than C, then A, or

It equals A followed by a digit, or

It equals A or a digit, or

It is empty.

So:

rlike '[^A0-9]$|[^A][0-9]$|[^C]A[0-9]$|[^C]A$|^A[0-9]$|^[A0-9]$|^$'

Untested, un-"optimized", probably at least one error somewhere in the above.

ysth
  • 96,171
  • 6
  • 121
  • 214
  • Here is a condensed version that should work: "([^A\d]|[^A]\d|[^C]A\d)$". Excellent spadework, ysth! +1 – Tomalak Dec 03 '08 at 21:29
2

Well the regex is coming from a column in a table, and most of the patterns there already work. So it would easiest if I could make this pattern work within the existing structure.

I did something similar once, what I did was create another column (I think I used a bitfield) that contained options for the regular expression (case-insensitivity, anchoring, negation, etc). A similar approach might work for you.

Robert Gamble
  • 106,424
  • 25
  • 145
  • 137
  • Yeah, looks like that's probably the way to go. I was hoping someone would get creative with [^C][^A] etc. But maybe it's impossible. – Greg Dec 02 '08 at 19:04