2

Hi I am trying to find all rows in my database (SQL Server) which have character é in their text by executing the following queries.

SELECT COUNT(*) FROM t_question WHERE patindex(N'%[\xE9]%',question) > 0;

SELECT COUNT(*) FROM t_question WHERE patindex(N'%[\u00E9]%',question) > 0;

But I found two problems: (a) Both of them are returning different number of rows and (b) They are returning rows which do not have the specified character.

Is the way I am constructing the regular expression and comparing the Unicode correct?

EDIT:

The question column is stored using datatype nvarchar. The following query gives the correct result though.

SELECT COUNT(*) FROM t_question WHERE question LIKE N'%é%';
rkg
  • 5,559
  • 8
  • 37
  • 50

1 Answers1

5

Why not use SELECT COUNT(*) FROM t_question WHERE question LIKE N'%é%'?

NB: Likeand patindex do not accept regular expressions.

In the SQL Server pattern syntax [\xE9] means match any single character within the specified set. i.e. match \, x, E or 9. So any of the following strings would match that pattern.

  • "Elephant"
  • "axis"
  • "99.9"
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • So the first one is matching with \,0,0,E or 9? Is there a way to actually specify the unicode in the query? – rkg Nov 30 '10 at 23:08
  • @Ravi - You could use `WHERE patindex(N'%' + NCHAR(0xE9) + '%',question) > 0` if for some reason you don't want to just use the literal `WHERE patindex(N'%é%',question) > 0` – Martin Smith Nov 30 '10 at 23:27
  • Thanks a bunch Martin. This really helped. I was under the impression that PATINDEX supports the normal regular expressions. – rkg Nov 30 '10 at 23:40