I'm running the following select statements on MySQL 5.0.88 with utf8 charset and utf8_unicode_ci collation:

SELECT * FROM table WHERE surname = 'abcß';

+----+----------+---------+
| id | forename | surname |
+----+----------+---------+
|  1 | a        | abcß    |
|  2 | b        | abcss   |
+----+----------+---------+

SELECT * FROM table WHERE surname LIKE 'abcß';

+----+----------+---------+
| id | forename | surname |
+----+----------+---------+
|  1 | a        | abcß    |
+----+----------+---------+

According to http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html, the German special character ß is treated as equal to ss under utf8_unicode_ci. Why does this only work with the "=" operator and not with LIKE? I have a phone book application and I desperately need both to behave the same way.

ObscureRobot
sub
  • What do `show variables like '%collation%'` and `show full fields from yourtable` show? The collation has to be identical throughout. – Marc B Oct 27 '11 at 14:30
  • The collation for all varchar fields of the table is set to "utf8_unicode_ci"; there are also some int fields with no collation at all. – sub Oct 27 '11 at 14:55
  • +1 very interesting question, love to hear the answer. – Johan Oct 27 '11 at 14:59
  • collation_connection is "utf8_unicode_ci", collation_database is "utf8_unicode_ci", collation_server is "latin1_swedish_ci". However, I thought that the server collation is overridden by the database collation; otherwise I can't explain why it works with the "=" operator. – sub Oct 27 '11 at 15:04

1 Answer

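Per the SQL standard, LIKE matches character by character, so it can produce results different from the = comparison operator. With =, the collation may treat one character as equal to a sequence of two (ß and ss under utf8_unicode_ci, ä and ae under latin1_german2_ci); LIKE never matches one character against two: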

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+

Source: http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html#operator_like
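
If both spellings have to be found with LIKE, one possible workaround is to expand the search term in the application and OR the variants together. The following is only a minimal sketch under several assumptions: the application is written in Python, the search is a prefix search, the driver uses `%s` placeholders (as MySQLdb and PyMySQL do), and only the ß/ss pair needs handling. The names `EQUIVALENTS`, `like_variants`, `escape_like` and `build_like_clause` are hypothetical and do not come from MySQL.

```python
from itertools import product

# Assumption: only ß/ss is handled here. Extend this hypothetical table
# with other pairs that "=" treats as equal but LIKE does not.
EQUIVALENTS = [("ß", "ss")]


def like_variants(term):
    """Return the spellings of `term` with each ß/ss occurrence swapped.

    The scan is greedy and left to right, so "sss" is read as "ss" + "s".
    """
    choices = []
    i = 0
    while i < len(term):
        for group in EQUIVALENTS:
            match = next((s for s in group if term.startswith(s, i)), None)
            if match is not None:
                choices.append(group)   # either spelling may appear here
                i += len(match)
                break
        else:
            choices.append((term[i],))  # ordinary character, one choice
            i += 1
    return ["".join(parts) for parts in product(*choices)]


def escape_like(value):
    """Escape LIKE wildcards, assuming backslash is the escape character."""
    return (value.replace("\\", "\\\\")
                 .replace("%", "\\%")
                 .replace("_", "\\_"))


def build_like_clause(column, term):
    """Build an OR-ed prefix-search clause plus its parameter list."""
    variants = like_variants(term)
    clause = " OR ".join(f"{column} LIKE %s" for _ in variants)
    params = [escape_like(v) + "%" for v in variants]
    return f"({clause})", params


if __name__ == "__main__":
    clause, params = build_like_clause("surname", "abcß")
    print(clause)
    print(params)
```

The clause and parameters would then be passed to the driver's `execute` call as usual. The number of variants doubles with every ß or ss in the term, which is usually acceptable for surnames but worth capping for arbitrary input.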

Karolis
  • Thanks Karolis. It seems I can't have both, which kind of sucks for a large phone book application with lots of special characters. Maybe MySQL FULLTEXT search could be an alternative. – sub Oct 27 '11 at 15:47