I have an input stream that contains the valid UTF-8 byte sequence 0xC2 0x85 (U+0085). How can I read and print it correctly in Java? I don't think a byte array helps, since 0x85 is out of range for a byte. Basically, I need to read UTF-8 text coming from a socket in Java, and that text contains 0xC2 0x85.

user1101293
  • Is this any use to you: http://stackoverflow.com/questions/4964640/reading-inputstream-as-utf-8 – Mike Hogan May 07 '13 at 16:28
  • It doesn't help. I have something like this: `byte[] buffer = new byte[10240]; socket.getInputStream().read(buffer);` – user1101293 May 07 '13 at 16:32
  • Why do you think `0x85` is out of range? – Louis Wasserman May 07 '13 at 16:39
  • Bytes are signed in Java; that only means that 0x85 (decimal 133) is interpreted, as a `byte`, as 133 - 256 = -123. That's no problem. Anyway, it's irrelevant to your goal: to translate bytes into characters, use an `InputStreamReader`, which does the work for you. – leonbloy May 07 '13 at 16:48
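
To make the signed-byte point above concrete, here is a minimal sketch. The class and method names (`SignedByteSketch`, `unsigned`) are made up for illustration and are not from the thread.

```java
class SignedByteSketch {

    // Hypothetical helper: returns the unsigned value (0..255) of a Java byte.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x85;                        // fits in a byte, just stored as a negative number
        System.out.println(b);                       // prints -123
        System.out.println(unsigned(b));             // prints 133
        System.out.printf("0x%02X%n", unsigned(b));  // prints 0x85
    }
}
```

The bit pattern is unchanged either way, so a `byte[]` holds 0xC2 0x85 without loss; only the way the value prints differs.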

1 Answer

If you have a valid UTF-8 stream (`is`), wrap it in a reader:

   Reader rdr = new InputStreamReader(is, "UTF-8");

This converts the UTF-8 bytes into Java chars, so the byte pair 0xC2 0x85 is decoded as the single char U+0085.
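
A slightly fuller sketch of the same idea, in case it helps. It assumes Java 7 or later (for `StandardCharsets` and try-with-resources), and it uses a `ByteArrayInputStream` as a stand-in for `socket.getInputStream()`. The class and method names (`Utf8ReadSketch`, `readAll`) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class Utf8ReadSketch {

    // Hypothetical helper: decodes the stream as UTF-8 until end of stream.
    // Closing the reader also closes the underlying stream.
    static String readAll(InputStream is) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader rdr = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = rdr.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the socket data: 'A', then U+0085 encoded as 0xC2 0x85, then 'B'.
        byte[] wire = { 'A', (byte) 0xC2, (byte) 0x85, 'B' };
        String text = readAll(new ByteArrayInputStream(wire));
        for (int i = 0; i < text.length(); i++) {
            // Print code points rather than the raw chars, since U+0085 is a control character.
            System.out.printf("U+%04X%n", (int) text.charAt(i));
        }
        // prints U+0041, U+0085, U+0042 on separate lines
    }
}
```

Letting the reader do the decoding also handles a multi-byte sequence that is split across two socket reads, which decoding each `byte[]` chunk separately would not. Note that `readAll` only returns once the peer closes the connection; for a long-lived socket you would read in a loop and process the chars as they arrive.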

Evgeniy Dorofeev