Our Oracle DB uses the UTF8 character set. We store addresses that need to be searchable. Some street names contain non-English characters (e.g. Peña Báináõ), and these must be searchable either as "Peña Báináõ" or with English equivalent characters, like "Pena Bainao". What we did was convert the text in the query, something like:
SELECT CONVERT('Peña Báináõ','US7ASCII') as converted FROM dual;
The issue is that not all characters are converted to an English equivalent (not even some pretty obvious ones like ñ or õ), so the text ends up converted to:
Pe?a Baina?
So if a user tries to find that address by typing "Pena Bainao", they can't find it, because "Pena Bainao" is different from "Pe?a Baina?".
We have figured out some dirty workarounds for this, but I wanted to check first whether someone has found a more elegant solution.
Here is a list of some characters that are not converted to US7ASCII:
Character - Unicode code point - Possible equivalent
æ - U+00E6 - ae
å - U+00E5 - a
ã - U+00E3 - a
ñ - U+00F1 - n
õ - U+00F5 - o
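
To make the problem concrete, a workaround along these lines could pre-map the characters from the list before calling CONVERT, and apply the same expression to both the stored value and the search text. This is only a rough sketch: the table addresses, the column street_name and the bind variable :search_text are hypothetical names, and the mapping covers only the lowercase characters listed above.

```sql
-- Sketch only: "addresses", "street_name" and ":search_text" are hypothetical names.
-- REPLACE handles the one-to-many mapping (æ -> ae), which TRANSLATE cannot express.
-- TRANSLATE maps the remaining listed characters one-to-one (å, ã -> a; ñ -> n; õ -> o).
-- CONVERT then deals with the characters it already handles (e.g. á -> a).
SELECT street_name
FROM addresses
WHERE CONVERT(TRANSLATE(REPLACE(street_name, 'æ', 'ae'), 'åãñõ', 'aano'), 'US7ASCII')
    = CONVERT(TRANSLATE(REPLACE(:search_text, 'æ', 'ae'), 'åãñõ', 'aano'), 'US7ASCII');
```

The obvious drawback is that every unmapped character (including the uppercase variants) has to be added to the mapping by hand, which is why something more elegant would be preferable.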