How to get only Cyrillic characters from a PostgreSQL table?

Question

I would like to find all the rows of a PostgreSQL table that contain Cyrillic characters. I tried this query: SELECT * FROM "items" WHERE (title SIMILAR TO '%[\u0410-\u044f]%'), which I took from here: Find all rows using some Unicode range (such as Cyrillic characters) with PostgreSQL?.

It seems to work, but other than the Cyrillic values, I also get some Latin values. How is that possible? I think that maybe, even when I type Latin letters, some of them could be stored as Cyrillic characters if I use a keyboard with a Cyrillic layout.

Anyway, I'm using this DB in a Java project. Is there a more efficient solution in code?

Thank you

zappee · Answer 1 · 2018-04-19T09:04:50.760

0

You need to use a regular expression. The function name in PostgreSQL is REGEXP_MATCHES.

Documentation:

Another solution is to use the regular expression match operator ~, which returns true or false.

Documentation: Using regexps in PostgreSQL

edited Apr 19 '18 at 09:04

answered Apr 19 '18 at 08:54

zappee


score 0 · Answer 2 · answered Apr 19 '18 at 09:13


Use the same mask:

t=# select regexp_replace('pol 398Родное Луговое abc 123','[^\u0410-\u044f]','','g');
 regexp_replace
----------------
 РодноеЛуговое
(1 row)
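Because the question mentions a Java project, the same mask can also be applied in code. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the title has already been read into a String. It strips everything outside U+0410 to U+044F, as the regexp_replace call above does, and lists the code points that remain, which may help to check whether a row that looks Latin actually contains Cyrillic look-alike letters:

```java
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original answer.
final class CyrillicExtractor {

    // Remove every character outside the range U+0410..U+044F,
    // i.e. the same mask as in the regexp_replace call.
    static String onlyCyrillic(String s) {
        return s.replaceAll("[^\\u0410-\\u044f]", "");
    }

    // List the code points of the Cyrillic characters found in s,
    // for example "U+0430" for a Cyrillic letter that looks like Latin "a".
    static String describe(String s) {
        return onlyCyrillic(s).codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // "cat" typed with a Cyrillic letter (U+0430) in the middle.
        String suspicious = "c\u0430t";
        System.out.println(describe(suspicious)); // prints U+0430
    }
}
```

An empty result from describe means the string has no character in that range.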


Vao Tsun


Now it's a bit better, but the problem is that when I have, for example, a string containing both Latin characters (or numbers) and Cyrillic characters, I get only the Cyrillic part as the result. What I want is the entire string. – salvo9415 Apr 19 '18 at 09:35
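One way to keep the entire string is to test for at least one Cyrillic character instead of replacing the others. A minimal Java sketch, assuming the titles have already been loaded into a list; the class and method names are hypothetical, and the range is the same U+0410 to U+044F mask used in this thread:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original thread.
final class CyrillicFilter {

    // Same range as the SQL mask: U+0410 through U+044F.
    private static final Pattern CYRILLIC = Pattern.compile("[\\u0410-\\u044f]");

    // True if s contains at least one character in the range.
    // find() looks for a match anywhere in the string, so s itself is untouched.
    static boolean containsCyrillic(String s) {
        return s != null && CYRILLIC.matcher(s).find();
    }

    // Keep whole titles that contain Cyrillic; drop the others.
    static List<String> keepRowsWithCyrillic(List<String> titles) {
        return titles.stream()
                .filter(CyrillicFilter::containsCyrillic)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // First title mixes Latin, digits and Cyrillic; second is Latin only.
        List<String> titles = Arrays.asList(
                "pol 398\u0420\u043e\u0434\u043d\u043e\u0435 abc 123",
                "abc 123");
        // Only the first title is kept, and it is kept in full.
        System.out.println(keepRowsWithCyrillic(titles));
    }
}
```

On the database side, the same idea would be a WHERE clause using the ~ operator with the same bracket expression, which returns the whole row. Note that this range does not include Ё (U+0401) or ё (U+0451); in Java, the script class \p{IsCyrillic} is a broader alternative.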
