
I see that you can specify UTF-16 as the charset via Charset.forName("UTF-16"), and that you can create a new UTF-16 decoder via Charset.forName("UTF-16").newDecoder(), but I only see the ability to specify a CharsetDecoder on InputStreamReader's constructor.

So how do you specify UTF-16 when reading from an arbitrary stream in Java?

IAmYourFaja
    If the class allows it, you can do so at the boundary between a byte stream and a character stream. InputStreamReader is one such class; for any other Reader that doesn't let you specify a character set, wrap it around an InputStreamReader. Lower-level constructs such as InputStream (a byte stream) have no concept of a character set. – nhahtdh Feb 26 '13 at 20:04

1 Answer


Input streams deal with raw bytes. When you read directly from an input stream, all you get is raw bytes; character sets play no part at that level.

The interpretation of raw bytes into characters, by definition, requires some sort of translation: how do I translate from raw bytes into a readable string? That "translation" comes in the form of a character set.
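
To make that concrete, here is a minimal sketch that decodes the same four bytes under two different character sets. The class and method names are made up for illustration; only String's byte-array constructor and StandardCharsets (available since Java 7) come from the JDK.

```java
import java.nio.charset.StandardCharsets;

public class CharsetInterpretationDemo {
    // Interprets the bytes as big-endian UTF-16 (two bytes per code unit).
    static String decodeUtf16BE(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16BE);
    }

    // Interprets the same bytes as ISO-8859-1 (one byte per character).
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The text "Hi" encoded as big-endian UTF-16, without a byte-order mark.
        byte[] raw = {0x00, 0x48, 0x00, 0x69};

        System.out.println(decodeUtf16BE(raw));          // prints "Hi"
        System.out.println(decodeLatin1(raw).length());  // prints 4: each byte became its own char
    }
}
```

The bytes never change; only the character set chosen to interpret them does.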

This "added" layer is implemented by Readers. Therefore, to read characters (rather than bytes) from a stream, you need to construct a Reader of some sort (depending on your needs) on top of the stream. For example:

InputStream is = ...;
Reader reader = new InputStreamReader(is, Charset.forName("UTF-16"));

This will cause reader.read() to read characters using the character set you specified. If you would like to read entire lines, use BufferedReader on top:

BufferedReader reader = new BufferedReader(new InputStreamReader(is, Charset.forName("UTF-16")));
String line = reader.readLine();
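
For a version you can run end to end, here is a small sketch that uses an in-memory ByteArrayInputStream as a stand-in for whatever stream you actually have. The class name and the readFirstLine helper are hypothetical, and StandardCharsets.UTF_16 (Java 7+) is used in place of Charset.forName("UTF-16"); the two should refer to the same charset.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class Utf16ReadDemo {
    // Wraps the byte stream in a UTF-16 Reader and returns its first line.
    // try-with-resources closes the reader, which also closes the underlying stream.
    static String readFirstLine(InputStream in) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_16))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real stream: two lines of text encoded as UTF-16.
        byte[] encoded = "first line\nsecond line".getBytes(StandardCharsets.UTF_16);
        InputStream in = new ByteArrayInputStream(encoded);

        System.out.println(readFirstLine(in)); // prints "first line"
    }
}
```

Note that Java's "UTF-16" charset honors a byte-order mark when decoding and falls back to big-endian if none is present. If your data is little-endian without a mark, StandardCharsets.UTF_16LE is probably the safer choice.
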
Isaac