
I would like to find all the rows of a PostgreSQL table that contain Cyrillic characters. I tried this query: SELECT * FROM "items" WHERE (title SIMILAR TO '%[\u0410-\u044f]%'), which I took from here: Find all rows using some Unicode range (such as Cyrillic characters) with PostgreSQL?.

It seems to work, but besides the Cyrillic values I also get some Latin values. How is that possible? I thought that maybe, even though I'm typing Latin letters, some of them could be read as Cyrillic if I use a keyboard with a Cyrillic layout.

Anyway, I'm using this DB in a Java project. Is there a more efficient solution in code?

Thank you

Vao Tsun
salvo9415

2 Answers


You need to use a regular expression. The function name in PostgreSQL is REGEXP_MATCHES.

Documentation:

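Another solution is to use the regular-expression match operator ~, which returns true or false, for example: WHERE title ~ '[\u0410-\u044f]'. Because it only tests for a match, the whole row is returned.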

Documentation: Using regexps in PostgreSQL

zappee

Use the same mask with a regular-expression function:

t=# select regexp_replace('pol 398Родное Луговое abc 123','[^\u0410-\u044f]','','g');
 regexp_replace
----------------
 РодноеЛуговое
(1 row)
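
Since the question mentions a Java project, the same range can also be checked on the Java side. The following is only a minimal sketch: the class name CyrillicFilter and its two methods are hypothetical, and it assumes the titles have already been loaded into a list. It keeps the whole string whenever at least one character falls in U+0410 to U+044F. The sample strings are written with \u escapes so the file compiles regardless of source encoding.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
public class CyrillicFilter {
    // Same range as the SQL mask: U+0410 to U+044F.
    // Assumption: this range is enough; U+0401 and U+0451 (the letter "yo") fall outside it.
    private static final Pattern CYRILLIC = Pattern.compile("[\\u0410-\\u044F]");

    // True if the string contains at least one character in the range.
    public static boolean containsCyrillic(String s) {
        return s != null && CYRILLIC.matcher(s).find();
    }

    // Keeps the complete strings that contain Cyrillic, not just the Cyrillic part.
    public static List<String> keepCyrillic(List<String> titles) {
        return titles.stream()
                .filter(CyrillicFilter::containsCyrillic)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> titles = List.of(
                "pol 398\u0420\u043E\u0434\u043D\u043E\u0435 abc 123", // mixed Latin, digits, Cyrillic
                "plain latin 42",                                       // no Cyrillic
                "\u041C\u043E\u0441\u043A\u0432\u0430");                // Cyrillic only ("Moskva")
        System.out.println(keepCyrillic(titles).size()); // prints 2
    }
}
```

If the table is large, filtering in the database is probably cheaper than fetching every row and testing it in Java. If the whole Cyrillic script is wanted rather than this range, Java's \p{IsCyrillic} class may be a better fit.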
Vao Tsun
  • Now it's a bit better, but the problem is that when I have, for example, a string containing both Latin characters (or numbers) and Cyrillic characters, I get only the Cyrillic part as the result. What I want is the entire string. – salvo9415 Apr 19 '18 at 09:35