I am using Java with an Oracle 10g database.

How can I specify a character encoding such as UTF-8 for the Oracle database through JDBC?
And how can I find out which encoding JDBC is currently using?

oers
Jack

2 Answers

The data transferred by the thin Oracle JDBC driver is always sent as UTF-16 (Java's internal representation). The database server will translate that into whatever character set it has been configured to use: the database character set for CHAR, VARCHAR2 and CLOB columns, and the national character set for NCHAR, NVARCHAR2 and NCLOB columns (so if the database was set up to use UTF-8, this conversion will happen automatically). Note that the character set is set at the database level, not at the schema or connection level.

To find out the character set configured on the DB, execute this query:

SELECT value$ FROM sys.props$ WHERE name = 'NLS_CHARACTERSET' ;

(The account you use to connect to the database needs permission to read system tables to run this.)
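
If the connecting account cannot read sys.props$, the same information can usually be fetched through JDBC from the NLS_DATABASE_PARAMETERS view, which is normally readable by ordinary accounts. The following is only a minimal sketch: the class and method names are made up for illustration, and it assumes you already have an open java.sql.Connection to the Oracle database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of any Oracle API.
public class CharsetLookup {

    // Reads the database and national character sets from a standard Oracle view.
    static final String QUERY =
        "SELECT parameter, value FROM nls_database_parameters "
      + "WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET')";

    // Returns a map such as {NLS_CHARACTERSET=..., NLS_NCHAR_CHARACTERSET=...}.
    // The caller owns the connection and is responsible for closing it.
    static Map<String, String> readCharsets(Connection conn) throws SQLException {
        Map<String, String> result = new LinkedHashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                result.put(rs.getString(1), rs.getString(2));
            }
        }
        return result;
    }
}
```

You would obtain the connection in the usual way, for example with DriverManager.getConnection and a thin-driver URL of the form jdbc:oracle:thin:@//dbhost:1521/service (host, port and service name here are placeholders), then print the map returned by readCharsets.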

Chris

I'm not sure I understand the question.

The Oracle database character set is set when the database is created and is, in general, quite painful to change. Your Java application is not going to be able to specify the database character set. You can see what the database and national character sets are with this query:

SELECT *
  FROM v$nls_parameters
 WHERE parameter LIKE '%CHARACTERSET'

Since your current database character set is ISO 8859-1, it will not be able to store characters from Asian languages. You can follow the instructions on character set migration in the 10g Globalization Support Guide to change the database character set of your existing database. You'll need to work with the DBA to do this since it's going to affect the entire database.
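
To get a feel for what an ISO 8859-1 character set can and cannot hold, you can use Java's own ISO-8859-1 charset as a rough stand-in. This is an assumption for illustration only: Oracle does its own conversion on the server, but the set of representable characters is the same. The class and method names below are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not talk to the database.
public class Latin1Check {

    // True if every character of the text exists in ISO 8859-1.
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    // Encodes to ISO 8859-1 and decodes again; unmappable characters become '?'.
    static String lossyRoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String french = "caf\u00E9";             // "cafe" with an acute accent
        String japanese = "\u65E5\u672C\u8A9E";  // three Japanese characters

        System.out.println(fitsLatin1(french));       // true
        System.out.println(fitsLatin1(japanese));     // false
        System.out.println(lossyRoundTrip(japanese)); // ???
    }
}
```

The question marks in the last line are the same kind of damage you would expect to see in a VARCHAR2 column when the database character set has no slot for the characters being stored.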

Internally, Java strings are always Unicode (UTF-16 in particular), so there is not much you can do to configure that. The output of your Java application may not be Unicode, though. If, for example, your Java application is generating a web site, there is a good possibility that the generated pages use some non-Unicode character set. But I don't think that's what you're asking about.
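
The boundary between bytes and Java strings is where an encoding actually gets chosen, so it is worth checking before blaming the database. Here is a small sketch of what happens when UTF-8 bytes (for example, from a request body) are decoded with the wrong charset. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing correct and incorrect decoding of UTF-8 bytes.
public class BodyDecoding {

    // What a client sending UTF-8 puts on the wire.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Correct: decode the bytes with the charset they were written in.
    static String decodeUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Incorrect for UTF-8 input: every byte becomes its own character.
    static String decodeLatin1(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // three Japanese characters
        byte[] wire = encodeUtf8(original);

        System.out.println(wire.length);                         // 9
        System.out.println(decodeUtf8(wire).equals(original));   // true
        System.out.println(decodeLatin1(wire).equals(original)); // false
        System.out.println(decodeLatin1(wire).length());         // 9
    }
}
```

If the string is already garbled at this point, no database setting will repair it. If it is intact here, the loss happens later, when the text is converted to the database character set.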

Justin Cave
  • My problem is handling different languages, especially Asian languages. Our system gets messages from around the world, and the messages are transported in the HTTP body (JSON format). I need to store these messages in a database so that people can then retrieve and display them. I am trying to use UTF-8 to support this and have set the HTTP Content-Type encoding to UTF-8, but characters are still messed up in the database – Jack Nov 04 '11 at 21:28
  • @MabroukAboughanaima - What is the database character set and the national character set (that's the output of the query I posted)? Are you storing the data in a VARCHAR2 column? Or in an NVARCHAR2 column? – Justin Cave Nov 04 '11 at 21:59
  • It's WE8ISO8859P1, which is "ISO 8859-1 West European". I am using VARCHAR2. How can I change the database character set to UTF-8 in Oracle 10g? Thanks a lot – Jack Nov 07 '11 at 14:17
  • @MabroukAboughanaima - I added a link in my answer to the character set migration chapter of the Globalization Support Guide that talks about changing the character set of an existing database. – Justin Cave Nov 07 '11 at 15:16