My problem is fairly simple:
new InputStreamReader(is, "UTF-8");
makes β and ・ show up as question marks.
Which encoding should I use to see those characters correctly?
You should use whichever encoding your input data is actually in. We can't work that out for you, although if you provide the bytes that are meant to represent those characters, we may be able to suggest some possibilities.
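If it helps with posting those bytes, here is a minimal sketch of a hex dump. The class and method names (HexDump, toHex) are made up for illustration, and the sample bytes are just β encoded as UTF-8, not your actual input:

```java
import java.nio.charset.StandardCharsets;

class HexDump {
    // Hypothetical helper: formats each byte as two uppercase hex digits, space-separated.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative byte values are not sign-extended.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in data: the Greek letter beta (U+03B2) encoded as UTF-8.
        byte[] sample = "\u03b2".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(sample)); // prints "CE B2"
    }
}
```

Run it over the first few dozen bytes of the real stream, read as raw bytes before any Reader touches them, and the output is something people can actually reason about.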
While you can sometimes apply heuristics to guess an encoding, you really should know it from where the data comes from. In this case you haven't given us any hint as to what your input is. If it's a web response, look at the Content-Type header of the response. If it's a file, it depends on what produced that file.
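For the web response case, a rough sketch of reading the charset parameter out of a Content-Type value and passing it to InputStreamReader might look like this. The helper names (charsetFrom, readAll) are hypothetical, the parsing is deliberately simple, and the UTF-16LE sample in main is a stand-in rather than a guess at your real data:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {
    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or the fallback if there is no usable charset parameter.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Decodes the whole stream with the given charset.
    static String readAll(InputStream is, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(is, cs)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed header value and body, for illustration only.
        String header = "text/html; charset=UTF-16LE";
        byte[] body = "\u03b2\u30fb".getBytes(StandardCharsets.UTF_16LE);
        Charset cs = charsetFrom(header, StandardCharsets.UTF_8);
        System.out.println(cs + ": " + readAll(new ByteArrayInputStream(body), cs));
    }
}
```

Falling back to UTF-8 when the header names no charset is a choice made for this sketch, not a rule; the right default depends on the content type.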
EDIT: Now that we know it is a web response, you don't have to go header-diving yourself, of course. You can use an HTTP client library which will download the data for you and decode it to a string itself.
Taken from The Java 5.0 Charset documentation.
Charset      Description
US-ASCII     Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
ISO-8859-1   ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8        Eight-bit UCS Transformation Format
UTF-16BE     Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE     Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16       Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
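So try each of these names as the second argument to the InputStreamReader constructor until the characters come out correctly. These are only the charsets that every Java platform is required to support; if none of them works, the data is in some other encoding.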
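A quick way to run that experiment is to read the raw bytes once and decode the same byte array with each candidate, since an InputStream can only be consumed once. This is only a sketch: the class and method names (TryCharsets, decodeWithEach) are invented here, and the sample bytes are a stand-in for your real input:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TryCharsets {
    // The six charsets from the table above.
    static final List<Charset> CANDIDATES = Arrays.asList(
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16);

    // Hypothetical helper: decodes the same bytes with every candidate charset,
    // keeping the results in table order.
    static Map<String, String> decodeWithEach(byte[] raw) {
        Map<String, String> results = new LinkedHashMap<>();
        for (Charset cs : CANDIDATES) {
            // Bytes that are invalid in a charset decode to U+FFFD rather than throwing.
            results.put(cs.name(), new String(raw, cs));
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in data; replace with the bytes actually read from the stream.
        byte[] raw = "\u03b2\u30fb".getBytes(StandardCharsets.UTF_8);
        for (Map.Entry<String, String> e : decodeWithEach(raw).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Keep in mind that a console which cannot display the characters will still show question marks even when the decoding is right, so compare the resulting strings in code or in a debugger if the output looks suspicious.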
Just adding to what the others said: once the text has been decoded, Java holds it as Unicode (a String is UTF-16 internally, not UTF-8), and that can represent any characters you have. The question here is how you read it, and that depends on what encoding the file was written in, which apparently is not UTF-8.
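If you want to confirm that suspicion rather than eyeball question marks, one option is a strict decoder that reports malformed input instead of silently replacing it. The sketch below uses a made-up helper name (decodesCleanly) and a single Latin-1 byte as stand-in data:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecode {
    // Hypothetical helper: true if the bytes decode in the given charset
    // without any malformed or unmappable sequences.
    static boolean decodesCleanly(byte[] raw, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in data: e-acute (U+00E9) as a single ISO-8859-1 byte, 0xE9.
        byte[] latin1 = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));      // prints "false"
        System.out.println(decodesCleanly(latin1, StandardCharsets.ISO_8859_1)); // prints "true"
    }
}
```

A clean decode only rules encodings out. ISO-8859-1 accepts every possible byte sequence, so passing this check does not prove the data was written in that encoding.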