2

I am trying to find a way to list only the names that have an accent.

I am using the following example taken from this question https://dba.stackexchange.com/questions/94887/what-is-the-impact-of-lc-ctype-on-a-postgresql-database

select firstname from (values ('bernard'), ('bérénice'), ('béatrice'), ('boris')) 
 AS l(firstname)
order by firstname collate "C";

Current output is

firstname
-------
bernard
boris
béatrice
bérénice

Expected output is

firstname
-------
béatrice
bérénice

Any ida what I should put in the where statement?

Tito
  • 601
  • 8
  • 23

2 Answers2

2

You have to create an extension first:

CREATE EXTENSION unaccent;

Then you can check with this query:

SELECT l.firstname 
FROM (VALUES ('bernard'), ('bérénice'), ('béatrice'), ('boris')) AS l(firstname)
WHERE l.firstname <> unaccent(l.firstname);
Tony
  • 604
  • 2
  • 6
  • 17
2

It's possible this will yield more than you ask for, but if you want to find all records that contain Unicode characters, you can use a regular expression for "not ASCII":

select firstname
from (values ('bernard'), ('bérénice'), ('béatrice'), ('boris')) 
 AS l(firstname)
where
  firstname ~ '[^[:ascii:]]'

Again, this will include more than just accents, but depending on your use case it might meet the need.

Hambone
  • 15,600
  • 8
  • 46
  • 69
  • This is better than the rest since I am required to create an extension and my role is SQL developer not admin which wont allow me to adjust or add any db extensions! This was spot on exactly what I wanted since im only looking at a list of name and it spotted them all for me!# – Tito Nov 26 '18 at 15:20