I have an input stream that contains the valid UTF-8 byte sequence 0xC2 0x85 (U+0085). How can I read and print this correctly in Java? I don't think a byte array helps, since 0x85 seems to be out of range for a byte. Basically, I need to read UTF-8 characters, including 0xC2 0x85, from a socket in Java.
- Is this any use to you: http://stackoverflow.com/questions/4964640/reading-inputstream-as-utf-8 – Mike Hogan May 07 '13 at 16:28
- It doesn't help. I have something like this: `byte[] buffer = new byte[10240]; socket.getInputStream().read(buffer)` – user1101293 May 07 '13 at 16:32
- Why do you think `0x85` is out of range? – Louis Wasserman May 07 '13 at 16:39
- Bytes are signed in Java; that only means that 0x85 (decimal 133) will be interpreted, as a byte, as 133 - 256 = -123. That's no problem. Anyway, that's irrelevant to your goal: to translate bytes into characters, use an `InputStreamReader`, which does the work for you. – leonbloy May 07 '13 at 16:48
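The arithmetic in the last comment can be checked with a small sketch. The class name `SignedByteDemo` and the helper `toUnsigned` are made up for illustration; they do not come from the thread.

```java
public class SignedByteDemo {

    // Hypothetical helper: returns the unsigned value (0..255) of a Java byte.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // 0x85 does not fit in the signed range -128..127, so a cast is needed;
        // the bit pattern is preserved, only its interpretation changes.
        byte b = (byte) 0x85;
        System.out.println(b);             // prints -123 (133 - 256)
        System.out.println(toUnsigned(b)); // prints 133
    }
}
```

So a `byte[]` can hold 0x85 without loss; the negative value only shows up when the byte is printed or widened to `int` without masking.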
1 Answer
If you have a valid UTF-8 stream (`is`), use

    Reader rdr = new InputStreamReader(is, "UTF-8");

This converts the UTF-8 bytes into Java chars, so the two bytes 0xC2 0x85 are read as the single char U+0085.
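A minimal, self-contained sketch of this approach follows. It assumes a `ByteArrayInputStream` as a stand-in for `socket.getInputStream()`, and the class name `Utf8ReadDemo` and helper `readAll` are made up for illustration. It uses `StandardCharsets.UTF_8` (Java 7+) instead of the string `"UTF-8"`, which avoids the checked `UnsupportedEncodingException`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8ReadDemo {

    // Hypothetical helper: decodes the whole stream as UTF-8 into a String.
    static String readAll(InputStream is) throws IOException {
        Reader rdr = new InputStreamReader(is, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        // read() returns the number of chars decoded, or -1 at end of stream.
        while ((n = rdr.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(): the two bytes from the question.
        InputStream is = new ByteArrayInputStream(new byte[] {(byte) 0xC2, (byte) 0x85});
        String text = readAll(is);
        // U+0085 is a control character, so print its code point instead.
        System.out.printf("U+%04X%n", (int) text.charAt(0)); // prints U+0085
    }
}
```

Decoding through a `Reader` rather than calling `new String(buffer)` on each chunk also handles a multi-byte sequence that happens to be split across two socket reads.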

Evgeniy Dorofeev