0

This simple example shows the issue I've run into, but I don't understand why...

I'm testing for the location of the first character that is either a lower or upper case letter, a single dash, or a period in a string parameter passed to me.

These two pattern matches appear to check the same thing, and yet run this code yourself and it will print a 0 then a 3:

PRINT PATINDEX ( '%[a-z,A-Z,-,.]%', '16-82') 
PRINT PATINDEX ( '%[-,a-z,A-Z,.]%', '16-82') 

I don't understand why it works only if the dash character is the first one we check for.

Is this a bug? Or working as designed and I missed something... I'm using SQL Server 2016, but I don't think that matters.

Brian B
  • 1,509
  • 3
  • 20
  • 29

1 Answers1

5

A dash within a character group may play either of the two roles:

  • It may denote the dash itself, like it does in the expression [-abc]
  • It may denote the "everything inbetween" operator, like it does in the expression [a-z].

In your particular example, the character group [a-z,A-Z,-,.] denotes the following:

  • Everything from a to z
  • Comma ,
  • Everything from A to Z
  • Everything from , to , (i.e. just the comma again).
  • Dot .

In fact, you probably wanted to write [-a-zA-Z.]

KT.
  • 10,815
  • 4
  • 47
  • 71
  • Excellent explanation. So the dash can go at the beginning or at the end for sure. The language should just let you escape it to make it crystal clear you mean the character and not the operator, IMO. – Brian B May 24 '18 at 00:15