I am creating a file from data queried from a database. The setup is as follows:

Database: Oracle, charset UTF8
Application server: Resin, charset UTF8
Application framework: NTT Intra-Mart (a Japanese framework based on Rhino that uses JavaScript as the server-side language)

Requirement: query data from Oracle and write it to a file encoded as [Shift-JIS]. The file is an intermediate file: one system exports it, it is transferred by FTP, and another system imports it. Each field must occupy a fixed byte range so that the destination server can locate the data to import, e.g.

byte 1-10: [user address]
byte 11-20: [user name]

When I write the file as UTF8, all characters appear to be shown correctly. When I write the data with charset [SJIS], however, some full-width characters become a half-width question mark [?]. This can shorten the byte width of a field, so the data can no longer be read correctly. For example, when [user address] is 1－10－1 (with full-width hyphens), the data in the file becomes 1?10?1:

byte 1-10: [user address], but in the current file the user address is byte 1-8
byte 11-20: [user name]

Could you please give me some advice?
1 Answer
You will have to use the charset name Windows-31J rather than Shift-JIS.

The data 1－10－1 was probably typed with Microsoft IME, which uses U+FF0D (FULLWIDTH HYPHEN-MINUS) to represent the character －.

U+FF0D is not mapped to any character in the Java VM's Shift-JIS to Unicode mapping, so you get ? when you convert － from the JVM's internal representation (UTF-16) to Shift-JIS with charset Shift-JIS.

U+FF0D is mapped to 0x817C in the Java VM's Windows-31J to Unicode mapping, so you get － when you convert － from the JVM's internal representation (UTF-16) to Shift-JIS with charset Windows-31J.
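To make the difference concrete, here is a minimal Java sketch that encodes the same string with both charsets and pads a field to a fixed byte width. The class name SjisDemo and the helper padToBytes are hypothetical names chosen for illustration, and padding with ASCII spaces is an assumption, since the question does not say how the destination system expects short fields to be filled. It also assumes a JDK that ships the extended charsets (Shift_JIS and Windows-31J are the names Java uses for them).

```java
import java.nio.charset.Charset;
import java.util.Arrays;

final class SjisDemo {
    // Encode text with the named charset. String.getBytes silently replaces
    // unmappable characters with the charset's replacement byte ('?').
    public static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // True if every character of text has a mapping in the named charset.
    // Useful for detecting a problem before the '?' replacement happens.
    public static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Hypothetical helper: right-pad the encoded field with ASCII spaces
    // (an assumed padding convention) to exactly `width` bytes.
    public static byte[] padToBytes(String text, int width, String charsetName) {
        byte[] raw = encode(text, charsetName);
        if (raw.length > width) {
            throw new IllegalArgumentException(
                "field needs " + raw.length + " bytes, limit is " + width);
        }
        byte[] out = new byte[width];
        Arrays.fill(out, (byte) ' ');
        System.arraycopy(raw, 0, out, 0, raw.length);
        return out;
    }

    public static void main(String[] args) {
        // "1－10－1" written with escapes so the source file encoding does not matter.
        String address = "1\uFF0D10\uFF0D1";
        for (String cs : new String[] {"Shift_JIS", "Windows-31J"}) {
            byte[] bytes = encode(address, cs);
            System.out.println(cs + ": " + bytes.length
                + " bytes, canEncode=" + canEncode(address, cs));
        }
        // Prints:
        // Shift_JIS: 6 bytes, canEncode=false
        // Windows-31J: 8 bytes, canEncode=true
    }
}
```

Checking canEncode before writing each field is one way to fail loudly instead of silently emitting ? and shifting the byte offsets of everything that follows.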

SATO Yusuke
Thank you for your advice. I have talked with the SE, and we have now abandoned the approach of reading the file by specified byte ranges and are using a CSV file instead. – auwind Jul 18 '20 at 04:05