3

I'm facing a problem in regular expressions case sensitivity. In a regular expression expression only the first item works within the square brackets.

create table
(
    FlowerId varchar(7)

    constraint chk_flid_regex 
    check(ActorId like'[A-Z][a-z][A-Z]' collate sql_latin1_general_CP1_CS_AS)
);

Eg. If i give something like this '[A-Z][a-z][A-Z]' Only the first [A-Z] is checked by the server. The third [A-Z] is not checked.

So if I insert values like 'Abc' gets inserted while it shouldn't actually get inserted into the table. Whereas, it doesn't give any error. It should only accept characters like 'AbC'.

Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171
Devi
  • 86
  • 4
  • That is kinda funky. I would have expected the case-sensitivy collation would have made it work. Another way you could also do it is build a check statement on the three characters your care about and make sure individually they're within the allowable characters. something like ascii(substring(actorId, 1, 1)) between 97 and 122 – Xedni Feb 13 '15 at 03:23

3 Answers3

3

Though I can't explain why the CS collation isn't working, switching to a binary collation seems to behave as you are expecting, at least with Sql Server 2008:

create table tbl
(
     FlowerId varchar(7)
constraint chk_flid_regex 
check(FlowerId like'[A-Z][a-z][A-Z]' collate Latin1_General_BIN)
);

Sql Fiddle

lheria
  • 581
  • 2
  • 4
  • 1
    Yes, using a binary collation (`_BIN` or `_BIN2`) is the way to solve this. The reason has to do with the concept of dictionary ordering as opposed to strict equality when filtering on ranges. I have provided a thorough explanation in this answer: http://stackoverflow.com/questions/26176370/why-does-filtering-on-a-range-match-the-wrong-case-when-using-a-case-sensitive-c/26185647#26185647 +1. And please see the example I am leaving in a comment on Damien's answer. – Solomon Rutzky Feb 18 '15 at 17:12
1

You can't use the range form, you have to expand out all the letters you want to use:

create table
(
    FlowerId varchar(7)

    constraint chk_flid_regex 
    check(ActorId like'[ABCDEFGHIJKLMNOPQRSTUVWXYZ][abcdefghijklmnopqrstuvwxyz][ABCDEFGHIJKLMNOPQRSTUVWXYZ]' collate sql_latin1_general_CP1_CS_AS)
);

Or switch to a binary collation. Why? Because in most collations, the lower case letters are placed between the upper case letters, either before or after their upper case equivalents. So the range A-Z expands out as AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZ - all you've excluded as a letter is lower-case z.


(I hope I've got my alphabet right each time I've had to type it out above, but you might want to check carefully that I've not missed out any letters)

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
  • Well, yes and no ;-). The SQL Server collation you are using actually sorts capitals first. But, the windows collation of "Latin1_General_CS_AS" does sort lower-case first. Test with the following: `SELECT 1 WHERE 'a' LIKE '[A-C]' COLLATE SQL_Latin1_General_CP1_CS_AS; SELECT 2 WHERE 'a' LIKE '[ABC]' COLLATE SQL_Latin1_General_CP1_CS_AS; SELECT 3 WHERE 'A' LIKE '[a-c]' COLLATE SQL_Latin1_General_CP1_CS_AS; SELECT 4 WHERE 'a' LIKE '[A-C]' COLLATE Latin1_General_CS_AS; SELECT 5 WHERE 'A' LIKE '[a-c]' COLLATE Latin1_General_CS_AS;` You should only get "1" and "5" being returned. – Solomon Rutzky Feb 18 '15 at 17:20
  • 1
    Just re-read @srutzky's comment. I think it should be `AaBb...YyZ` for `sql_latin1_general_CP1_CS_AS`. (Because, as his tests show, `'a'` falls between `'A'` and `'C'`, so it can't go before `'A'`.) And while we are at it, your alphabet is fine, except the lower-case sequence's got an extra zed. – Andriy M Feb 25 '15 at 14:25
  • @AndriyM - cheers. It's amazing how you can look at these things tens or hundreds of times and still not spot the obvious. – Damien_The_Unbeliever Feb 25 '15 at 14:31
-1
^[A-Z][a-z][A-Z]$

Add anchors to disallow partial matches.

vks
  • 67,027
  • 10
  • 91
  • 124
  • 1
    This answer is entirely irrelevant to the question. The title was (until I just fixed it) admittedly misleading, but reading the question should have made it very clear that this is not about RegEx but instead about a SQL LIKE operator that does not support anchors. Though, even if this was a question about Regular Expressions, this answer would still be wrong given that the O.P. states that `Abc` is matching and this has nothing to do with partial matches. -1 – Solomon Rutzky Feb 18 '15 at 16:54