3

I am writing a ColdFusion program that uses cfquery to get data from an AS/400 iSeries table and then output that data to a web page. Some times the Data is in Chinese, but it does not output the Chinese characters correctly.

I built the query below for testing,

<cfprocessingdirective pageEncoding="UTF-8" />
    <cfquery name="Test" Datasource = "AS400">
        select dsc1 from sales where ref = '123456'
    </cfquery>

<cfoutput>#test.dsc1#</cfoutput>

The result should be "M5方头螺栓" but I only get "M5". I did another test running just:

<cfset x = "M5方头螺栓"/>
<cfoutput>#x#</cfoutput>

and it displays the Chinese no problem.

Since ColdFusion can display the characters when they are written out in the code, but not when it goes to get the data through SQL, it seems like the issue is with either my ODBC settings or my ColdFusion Server Data Source Settings but I'm not familiar enough with these settings to know what needs to be changed to get this working.

MHall
  • 31
  • 2
  • 1
    Under the ColdFusion administrator datasource settings select the appropriate datasource you are using. Then click on the "show advanced settings" button. That will show an option for "String Format" _Enable High ASCII characters and Unicode for data sources configured for non-Latin characters_. Check that box for this datasource. And as Adrian pointed out, the data must be stored in an NVARCHAR field in the database. – Miguel-F May 15 '17 at 20:05
  • I am using Coldfusion 2016, I checked the Advanced Settings for the data source I am using and there is no option showing "String Format" or anything mentioning Enable High ASCII characters and Unicode for data sources configured for non-Latin characters. There is an option showing: CLOB -- Enable long text retrieval (CLOB). BLOB -- Enable binary large object retrieval (BLOB). I tested both of these but neither made a difference. My data source is not set as a Trusted Source, would this make a difference? – MHall May 15 '17 at 20:27
  • Ah yeah right, the ODBC datasources do not have that option. Look to see if that setting can be configured with the ODBC driver you are using. – Miguel-F May 15 '17 at 20:32
  • Ive started to look around in the ODBC settings to see if there is anything i can do. I've also added an ODBC tag to the question since that seems to be more so the issue. Thanks. – MHall May 15 '17 at 21:03
  • A person I work with did something like ` select Binary(DSC1,60) as DSC1 from sales where ref='123456' #CharsetEncode(DSC1,"utf-16")# ` which works, So it looks like its getting the data in utf-16. When I try `` I get an error saying : "Cannot use the charset UTF-16 because the file has a Byte Order Mark indicating it uses the charset UTF8. " Is there a way to change the BOM to UTF16? – MHall May 16 '17 at 03:47
  • 1
    Is this a DB2 database? Maybe you can try and use the provided JDBC driver for "DB2 Universal Database" instead. – Miguel-F May 16 '17 at 13:40
  • That Worked Great!!!!! Thank you!!! – MHall May 16 '17 at 14:46
  • 1
    Awesome! It was a shot in the dark but glad it worked for you. – Miguel-F May 16 '17 at 15:12

2 Answers2

4

A workaround was found and discussed within the comments. Adding some details here as an answer for future visitors to this page.

There are a couple of considerations when dealing with Unicode (Chinese) characters:

  • The data type for the database table must be set to nvarchar
  • The form processing script (CFML) must be set to utf-8
    • I believe ColdFusion defaults to this but you can specify the setting to be sure.
      For example: <cfprocessingDirective pageEncoding=”utf-8″>
  • Enable "String Format" within the ColdFusion datasource settings
    • Under the ColdFusion administrator datasource settings select the appropriate datasource you are using. Then click on the "show advanced settings" button. That will show an option for "String Format" Enable High ASCII characters and Unicode for data sources configured for non-Latin characters. Select this option and save the datasource.

The issue for the OP was that they were using an ODBC datasource and the "String Format" option was not available. After some research and the lack of finding any way to configure an ODBC datasource for that setting I recommended trying to use the builtin JDBC driver for "DB2 Universal Database" that comes with ColdFusion. Switching to that driver resolved this issue for the OP.

From the comments

Good info. Though is "Enable String Format..." necessary with the added support for cf_sql_nvarchar in CF10+? – @Leigh

I do believe Leigh is correct that the newer versions of ColdFusion (10 and later) have much better support for nvarchar fields.

Also to note, it looks like some older versions of ColdFusion don't always work with the installed DB2 Universal Driver, and it doesn't look like the older standard versions even have it, I'm not sure if the newer ones have it either, but using the "other" option with jt400.jar, should also work. - @MHall

Miguel-F
  • 13,450
  • 6
  • 38
  • 63
  • Good info. Though is "Enable String Format..." necessary with the added support for cf_sql_nvarchar in CF10+? – Leigh May 17 '17 at 00:41
  • also to note, it looks like some older versions of Coldfusion dont always work with the installed DB2 Universal Driver, and it dosn't look like the older standard versions even have it, im not sure if the newer ones have it either, but using the "other" option with jt400.jar, should also work. – MHall May 17 '17 at 01:31
  • Good points guys. I will add them to the answer so people will see them. – Miguel-F May 17 '17 at 12:03
1

You've already proven that CF can output UTF-8 characters correctly. Have you tried running that query in the DB console or UI? Do you get the correct charaters?

If the characters were stored as VARCHAR and not NVARCHAR, then there's nothing you can do. The data has to have been properly stored in the first place.

If the characters are stored correctly in the DB, try adding <cfprocessingdirective pageEncoding="utf-8"> at the top of the request. CF should be using UTF-8 by defualt, but this will force the correct character set if, for some reason, it isn't.

Adrian J. Moreno
  • 14,350
  • 1
  • 37
  • 44
  • The data is in NVARCHAR. I have tested in Debug and I get M5. In my code i had: I removed the / and there is no difference. I'm new to this and not sure how to make my code look pretty when its a comment, sorry for any inconvenience. – MHall May 15 '17 at 20:22
  • @MHall - For comments, use [back ticks](https://en.wikipedia.org/wiki/Grave_accent) :-) – Leigh May 15 '17 at 22:29