Strings transcoding in Java

Question

I recently found a piece of code that does the following:

String s = ... // whatever
...
s = new String(s.getBytes(myEncoding), myEncoding);

To me this looks like absolute nonsense.

Is it possible that under certain circumstances (some specific combination of locale settings, used technologies, etc.), this code will do something useful?

Thanks in advance

score 2 · Answer 1 · answered Nov 28 '11 at 17:56


Yes, that code is generally nonsense. And yes, it is possible that it does "something" to the string (probably nothing good). Generally speaking, if you have already converted bytes to chars incorrectly, trying to re-convert them is rarely going to give you legitimate results. (There may be isolated cases where the right combination of character encodings happens to work.)
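To illustrate one such "right combination", here is a minimal sketch that assumes UTF-8 bytes were wrongly decoded as ISO-8859-1. The class and method names are hypothetical and only serve the illustration. The repair works because ISO-8859-1 maps every byte value to exactly one char, and it needs two different charsets, unlike the line in the question, which uses the same one twice.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Simulates the original mistake: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undoes it: recover the raw bytes, then decode them with the right charset.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";            // "café", 4 chars
        String garbled = misdecode(original);
        System.out.println(garbled.length());     // 5: the é became two chars
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

With most other charset pairs the first wrong decode already loses information, so no second conversion can bring the original text back.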


jtahlborn



This could simply be bad semantics but important code. myEncoding could reference a custom CharsetEncoder. A CharsetEncoder can replace unknown characters with a custom byte array and then strip it on re-encoding. You could probably implement something like html escaping using a custom CharsetEncoder. If that's the case, this code seems too-clever-for-its-own-good. I'd recommend replacing the encode-decode line with a better-named method. – ccoakley Nov 28 '11 at 18:21
@ccoakley myEncoding is plain ISO-8859-1 encoding. – edio Nov 28 '11 at 18:24
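For the plain ISO-8859-1 case, a small sketch of what the round trip actually does. The class and method names are hypothetical, and the sketch uses the `getBytes(Charset)` overload, which is documented to replace unmappable characters with the charset's replacement byte (`?` for ISO-8859-1); the `getBytes(String)` overload in the question leaves that case unspecified. Under that assumption, the line leaves Latin-1 text unchanged and turns everything else into `?`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Same shape as the line in the question: encode and decode with one charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        String latin = "caf\u00e9";   // every char exists in ISO-8859-1
        String euro = "\u20ac5";      // the euro sign does not
        System.out.println(roundTrip(latin, latin1).equals(latin)); // true
        System.out.println(roundTrip(euro, latin1));                // ?5
    }
}
```

So the only observable effect is a lossy filter. If that is the intent, a named method that says so would be clearer than the bare encode-decode line.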

