I wrote the following code to convert a string from its detected charset to another one. Please see it below:
public static String convertCharsetOfTheString(String strToConvert, String targetCharsetName) throws UnsupportedEncodingException {
    CharsetDetector detector = new CharsetDetector();
    detector.setText(strToConvert.getBytes());
    CharsetMatch detect = detector.detect();
    String currentCharsetName = detect.getName();
    Charset currentCharset = Charset.forName(currentCharsetName);
    Charset targetCharset = Charset.forName(targetCharsetName);
    ByteBuffer wrap = ByteBuffer.wrap(strToConvert.getBytes(targetCharsetName)); //.wrap(strToConvert.getBytes());
    CharBuffer decode = currentCharset.decode(wrap);
    ByteBuffer encode = targetCharset.encode(decode);
    return new String(encode.array(), targetCharsetName);
}
For some symbols I get an encoding/decoding error. For example, the hiragana letter じ becomes unreadable.
I assume it's because a hiragana character takes 3 bytes instead of two, but I don't know how to fix the problem.
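To check the byte-count assumption, here is a minimal sketch that uses only java.nio.charset from the JDK, with no charset detection. The class name CharsetSketch and the helper transcode are hypothetical names made up for this illustration. It shows that じ (U+3058) is 3 bytes in UTF-8 but 2 bytes in UTF-16BE, and that a byte-level conversion survives the round trip when the bytes are decoded with the charset they were actually written in. In the method above, by contrast, the bytes are produced with the target charset and then decoded with the detected one, and as far as I understand encode.array() may also return backing-array bytes beyond the buffer's limit.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the original code.
class CharsetSketch {

    // Re-encode raw bytes: decode with the charset they were written in,
    // then encode the resulting text with the target charset.
    static byte[] transcode(byte[] input, Charset source, Charset target) {
        String text = new String(input, source);
        return text.getBytes(target);
    }

    public static void main(String[] args) {
        String ji = "\u3058"; // hiragana letter じ

        // The same character has a different byte length per encoding.
        byte[] utf8 = ji.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);                                 // prints 3
        System.out.println(ji.getBytes(StandardCharsets.UTF_16BE).length); // prints 2

        // Converting the bytes keeps the character intact.
        byte[] utf16 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE);
        System.out.println(new String(utf16, StandardCharsets.UTF_16BE).equals(ji)); // prints true
    }
}
```

So the 3-byte length by itself does not seem to be what breaks the character; the sketch suggests that a charset conversion is an operation on byte arrays, while a String only holds the decoded text.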
Does anybody know how to fix it?