1

What's the problem with CHAR(13) or perhaps CHAR(14) in TSQL patindex? As soon as I include CHAR(14) in a pattern, I get no records found. Searching for an answer, I just found my own question (unanswered) from 2009 (here: http://www.sqlservercentral.com/Forums/Topic795063-338-1.aspx).

Here is another simple test, to show what I mean:

/* PATINDEX TEST */
DECLARE @msg NVARCHAR(255)
SET @msg = 'ABC' + NCHAR(13) + NCHAR(9) + 'DEF'

DECLARE @unwanted NVARCHAR(50)
-- unwanted chars in a "chopped up" string
SET @unwanted = N'%[' + NCHAR(1) + '-' + NCHAR(13) + NCHAR(14) + '-' + NCHAR(31) + ']%'
SELECT patindex(@unwanted, @msg)

-- Result: 4

-- NOW LET THE unwanted string includ the whole range from 1 to 31


   SET @unwanted = '%['+NCHAR(1)+'-'+NCHAR(31)+']%' -- -- As soon as Char(14) is included, we get no match with patindex!
    SELECT patindex(@unwanted, @msg)

-- Result: 0

Sparky
  • 14,967
  • 2
  • 31
  • 45
Andreas Jansson
  • 830
  • 1
  • 9
  • 21

1 Answers1

6

It is permitted.

You need to bear in mind that the ranges are based on collation sort order not character codes however so perhaps in your default collation it sorts in a position that you do not expect.

What is your database's default collation?

What does the following return?

;WITH CTE(N) AS
(
SELECT 1 UNION ALL
SELECT 9 UNION ALL
SELECT 13 UNION ALL
SELECT 14 UNION ALL
SELECT 31 
)
SELECT N 
FROM CTE
ORDER BY NCHAR(N)

For me it returns

N
-----------
1
14
31
9
13

So both characters 9 and 13 are outside the range 1-31. Hence

'ABC' + NCHAR(13) + NCHAR(9) + 'DEF' NOT LIKE N'%['+NCHAR(1)+N'-'+NCHAR(31)+N']%'

Which explains the results in your question. Character 14 doesn't enter into it.

You can use a binary collate clause to get it to sort more as you were expecting. e.g.

SELECT patindex(@unwanted COLLATE Latin1_General_100_BIN, @msg)

Returns 4 in the second query too.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • My collation is Finnish_Swedish_CI_AS, and your CTE returned the result in the same order for me. I had never imagined that the characters lower than 32 could be in a different order, and depending on collation! Thank you! – Andreas Jansson Jan 23 '12 at 08:15