The following code does not work on a Linux machine.

        Charset charset = Charset.forName("UTF-8");
        CharsetDecoder decoder = charset.newDecoder();

        try {
            FileOutputStream fo = new FileOutputStream("hi.txt");
            PrintStream ps = new PrintStream(fo);
            String msgBody = "ΣYMMETOXH";
            ps.println(decoder.decode(ByteBuffer.wrap(decoder.decode(ByteBuffer.wrap(msgBody.getBytes())).toString().getBytes())));
            ps.close();
            fo.close();
        } catch (CharacterCodingException e) {
            e.printStackTrace();
        }

This code works on Windows. What could be the issue? On the Linux machine the decoder does not decode the string.

Harry Joy
  • Does it throw some kind of exception? Does it print garbage into the file? Also, your file name is quite Windows-specific, although I'm sure Linux is able to create a file with such a convoluted name in your current directory. – andri Nov 18 '11 at 13:27
  • It doesn't throw any exceptions. It creates a file and writes the same text as the `msgBody` variable instead of the decoded string. – Harry Joy Nov 18 '11 at 13:28

1 Answer

The problem is that you're using String.getBytes() at least once, possibly twice (your enormously long line is hard to read; using several statements would make it easier to understand). That doesn't specify an encoding, so it'll use the platform default encoding. At that point, you've got a platform dependency... hence the problem.
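As a minimal sketch of the alternative, the snippet below passes an explicit charset for both encoding and decoding, so the result no longer depends on the platform default. The class and method names are hypothetical, and it assumes Java 7 or later for `StandardCharsets`; the Greek word is written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Encode with an explicit charset instead of the platform default.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Greek word from the question, written as Unicode escapes.
        String greek = "\u03A3\u03A5\u039C\u039C\u0395\u03A4\u039F\u03A7\u0397";
        byte[] utf8 = encodeUtf8(greek);
        // Each of the 9 Greek capitals takes 2 bytes in UTF-8.
        System.out.println(utf8.length);                    // 18
        System.out.println(decodeUtf8(utf8).equals(greek)); // true
    }
}
```

The same two calls give the same bytes on Windows and Linux, which is not guaranteed for the no-argument `getBytes()`.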

It's not at all clear what you're trying to achieve, but if you're looking for reasons for platform-specific behaviour, that's the first thing to look at.

Oh, and creating a PrintStream like that will have the same issue... create an OutputStreamWriter with a specific encoding instead.
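A rough sketch of that suggestion, assuming Java 7 or later (for `StandardCharsets` and try-with-resources); the class name and the `writeLine` helper are made up for illustration, and `hi.txt` is simply the file name from the question:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8WriterDemo {
    // Write one line of text to the stream, always encoded as UTF-8.
    static void writeLine(OutputStream out, String text) throws IOException {
        // Closing the writer flushes it and closes the underlying stream.
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
            writer.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        // The Greek word from the question, written as Unicode escapes.
        String greek = "\u03A3\u03A5\u039C\u039C\u0395\u03A4\u039F\u03A7\u0397";
        writeLine(new FileOutputStream("hi.txt"), greek);
    }
}
```

This only fixes the output side: if the string is already garbled when it reaches this code, the file will faithfully contain the garbled text in UTF-8.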

Jon Skeet
  • I passed "UTF-8" to getBytes() but still face the same issue. – Harry Joy Nov 18 '11 at 13:33
  • @HarryJoy: It's not clear what you mean by "the same issue" as you haven't explained what you're trying to do vs what's actually happening. Note my bit about using `PrintStream` by the way. There's *lots* of encoding and decoding going on in your code - the problem could be in any of them (or multiple). – Jon Skeet Nov 18 '11 at 13:35
  • I'm trying to decode a Greek word and save it in a file. My servlet receives the Greek word as `ΣYMMETOXH`. When I decode it with the code above, it stays the same on the Linux machine, but on Windows it works fine. I also tried `OutputStreamWriter`, which doesn't work either. – Harry Joy Nov 18 '11 at 13:41
  • @HarryJoy: It sounds like your servlet is receiving the text incorrectly to start with - as far as I'm aware, "£" isn't a valid character in a Greek word... I wouldn't be surprised if the problem was that the code on Windows is *coincidentally* making two errors which cancel each other out. – Jon Skeet Nov 18 '11 at 13:43
  • I'm entering the Greek word as `ΣΥΜΜΕΤΟΧΗ` – Harry Joy Nov 18 '11 at 13:45
  • @HarryJoy: Into the web page? Right. Now work out *exactly* how that's going to be transmitted in the web request, and make sure that you can decode that correctly in the servlet. Once you've *received* it correctly, then you can work out how to save it. It's really important that you tackle one conversion at a time. – Jon Skeet Nov 18 '11 at 13:46
  • Can you tell me which encoding is used on Windows when I call getBytes() without an argument? – Harry Joy Nov 18 '11 at 13:53
  • @HarryJoy: See `Charset.defaultCharset` - but you shouldn't just try to make your Linux box use the same encoding as the Windows box. You should work out *where* things are going wrong. It's not clear to me that you've mapped out each stage in the conversion process, and worked out what encoding should be used at each stage. – Jon Skeet Nov 18 '11 at 13:56
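The "two errors cancelling out" idea from the comments can be illustrated with a small sketch. It assumes the wrong charset involved is a single-byte Latin one; ISO-8859-1 is used here as a stand-in because every JVM supports it, and nothing in the thread confirms which charset the Windows machine actually uses. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // First error: UTF-8 bytes are decoded with a single-byte charset.
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Second error: re-encoding with that same charset and decoding as UTF-8
    // happens to restore the original text.
    static String undo(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // This is what the no-argument getBytes() uses on the current machine.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String sigma = "\u03A3";                               // Greek capital sigma
        String garbled = misdecode(sigma);
        System.out.println(garbled.equals("\u00CE\u00A3"));    // true: the "Σ" pair
        System.out.println(undo(garbled).equals(sigma));       // true
    }
}
```

If the default charset on one machine is a single-byte Latin charset and on the other is UTF-8, code that relies on the default can appear to "decode" on the first machine and leave the text unchanged on the second, which matches the symptom described, though it is only one possible explanation.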