11

I have a JSF application running on Tomcat 6 on Fedora 17, using Firebird as the database. All records coming from the database into the application have an encoding problem.

The language is Brazilian Portuguese, so I need characters such as é, ã and ç, and all of these special characters come out wrong.

The é's and ã's in the source code itself display fine; only the ones coming directly from the database cause trouble.

Any idea what is going on?

Here is an image where the weird character should be é:

[Image: datatable with the problem]

The problem happens when the data is retrieved from the DB.

BalusC
Vitor Hugo
  • JSF 1.x or 2.x? Which "trouble" exactly are you talking about? Please provide more detail. Which characters are you seeing instead? At which step exactly is it showing wrong characters? Directly after retrieving from DB in Java code? Or only in the generated HTML output? – BalusC Oct 31 '12 at 11:16
  • What is the default character set of the DB (or of the specific columns), what is the connection character set, and is this data coming from a BLOB SUB_TYPE TEXT or a (VAR)CHAR? – Mark Rotteveel Oct 31 '12 at 11:24
  • tried to add more information – Vitor Hugo Oct 31 '12 at 11:47
  • You're still not clear on when exactly it fails. Please describe the problem from a developer's perspective, not an end user's perspective. To start, in the code *directly* after retrieving the data from the DB, put a debug breakpoint, a logger or a `System.out.println` so that you can investigate whether the JDBC driver has decoded it properly. Note that you should make absolutely sure that your IDE and the logger/stdout console are themselves using the right charset (i.e. you must be able to do `System.out.println("éãç")` and see it back as-is in the console). – BalusC Oct 31 '12 at 11:50
  • Note that I assume that the characters have been stored properly in the DB. So if you look in the DB directly using some DB admin tool, those characters should look just fine. Otherwise it wouldn't make sense to post this as a JSF problem in the first place. As you're using JSF 2.x (which already uses UTF-8 by default in all layers), I think more and more that the problem is actually in the DB setup or the JDBC driver config. – BalusC Oct 31 '12 at 11:52
  • Here, on my project, the data is stored correctly and rendered correctly without a problem. This is a Tomcat problem, and it can only be that, because everything else is OK except when running on the default installation of Tomcat on the client's Linux server... Maybe I should remove the other tags – Vitor Hugo Oct 31 '12 at 11:55
  • If you did not specify a connection characterset, then Jaybird will use NONE, which means the bytes received from the server are converted using the Java default characterset. So if this is not the same as the default characterset on the other server then you can get different results. – Mark Rotteveel Oct 31 '12 at 16:29
  • thanks mark that helped me solve the problem... – Vitor Hugo Oct 31 '12 at 18:02
  • Vitor, you keep failing to describe this problem from a developer's perspective and you keep blaming Tomcat. In future questions, please spend a bit more effort on debugging/logging/tracking. – BalusC Oct 31 '12 at 18:12
  • I hear you Balus, and I have to admit that for me it is more a language difficulty in explaining than really time spent trying to understand... I apologize for that, and I'll keep working on it. (I'm also learning a lot here) – Vitor Hugo Oct 31 '12 at 18:22

3 Answers

25

Using the encoding=ISO/UTF/WIN... query parameter in the JDBC connection URL solved the problem.

For example:

jdbc:firebirdsql:url:db?encoding=ISO8859_1
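When more than one connection property is needed, the first one follows `?` and the rest are joined with `&`. The following is only a minimal sketch of assembling such a URL; the class `JaybirdUrlSketch` and the helper `withParams` are hypothetical names made up for illustration, the base URL is the same placeholder as above, and `roleName=XYZ` is just an example value. It builds a string and does not open a connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JaybirdUrlSketch {

    // Hypothetical helper: appends connection properties to a JDBC URL,
    // using '?' before the first property and '&' before each later one.
    static String withParams(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(entry.getKey())
               .append('=')
               .append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("roleName", "XYZ");   // example value only
        params.put("encoding", "UTF8");  // Firebird character set name
        System.out.println(withParams("jdbc:firebirdsql:url:db", params));
    }
}
```

Keeping the encoding in the URL (rather than relying on a server or JVM default) means the setting travels with the application configuration when it is deployed to another machine.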
BalusC
Vitor Hugo
  • Thanks! Also I had to deal with & escaping like this [jdbc:firebirdsql:url:db?roleName=XYZ&encoding=UTF8] because the ? is for the very first parameter only. – A. Masson Oct 22 '14 at 17:10
  • @Vitor Hugo Thank you sooo much. I was totally confused about why my connection wasn't working anymore. – GreenEyedAndy Aug 28 '23 at 06:58
2

When you don't specify the connection character set in Jaybird (either the property encoding with a Firebird character set name, or charSet with a Java character set name), Jaybird falls back to the Firebird connection character set NONE. Roughly, this means the server does not transliterate characters from the storage representation of a (VAR)CHAR column and sends its bytes as is.

This means that Jaybird receives a sequence of bytes in an unknown character set. Jaybird will then use the default character set of your Java installation to convert those bytes to strings. So if the database (or column) character set does not match your Java character set, it can produce incorrect results. Even worse: reading and writing this way from different systems with different default Java character sets can produce total garbage or transliteration errors.
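The effect of decoding bytes with the wrong character set can be reproduced without a database. The sketch below is plain Java and does not use Jaybird; the class name `CharsetMismatchDemo` and the assumption that the column stores ISO-8859-1 while the JVM decodes as UTF-8 are made up for illustration, and the actual charsets on a given server may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {

    // Decodes raw bytes with the given charset, which is what a driver has
    // to do with column bytes it receives from the server.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, written as a Unicode escape so the source
        // file's own encoding cannot affect the demonstration.
        String original = "caf\u00e9";

        // Assumption: the column stores ISO-8859-1, where e-acute is the
        // single byte 0xE9.
        byte[] stored = original.getBytes(StandardCharsets.ISO_8859_1);

        // Matching charset: the text round-trips unchanged.
        System.out.println(decode(stored, StandardCharsets.ISO_8859_1).equals(original));

        // Mismatched charset: a lone 0xE9 is not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD.
        System.out.println(decode(stored, StandardCharsets.UTF_8).equals("caf\uFFFD"));

        // The charset a connection without an explicit setting would rely on
        // on this machine.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
    }
}
```

Running this on two machines and comparing the last line is a quick way to check whether differing JVM defaults could explain why the same application behaves differently on each.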

The solution: always specify an explicit connection character set. Even better is to make sure your database has a default character set (or that every (VAR)CHAR column has its character set explicitly defined).

The next version of Jaybird (2.3) will probably refuse to connect if no explicit connection character set was specified to force users to consider this issue (and if they still want the old behavior then they can explicitly specify encoding=NONE).

Mark Rotteveel
0

My 2 cents, since I got here via Google looking for an answer: I had an issue with InterBase 7.5.80 using the legacy Jaybird driver (v1.5). Any encoding I used on the connection other than 'NONE' resulted in a timeout when getting a connection. Eventually I kept using 'NONE':

FBWrappingDataSource dataSource = new FBWrappingDataSource();
dataSource.setDatabase("url");
dataSource.setType("TYPE4");
dataSource.setEncoding("NONE");
dataSource.setLoginTimeout(10);
java.sql.Connection c = dataSource.getConnection("sysdba", "masterkey");

I then read each column with getBytes and created a new String instance with a specific encoding:

String column = new String(rs.getBytes("column"), "Windows-1255");

[rs is ResultSet of course]

Good luck!

baraka