I have a String representing a URL, and I need to get its HTML source code. The problem is that I can't find a way to read it in the correct encoding: letters like à è ì ò ù are not decoded properly and come out as "??".
What's the best way? I came across lots of solutions, but none of them seems to work.
Here's my code:
private String getHtml(String url, String idSession) throws IOException
{
    URL urlToCall = null;
    String html = "";
    try
    {
        urlToCall = new URL(url);
    }
    catch (Exception e)
    {
        e.printStackTrace();
        return "";
    }
    HttpURLConnection conn;
    conn = (HttpURLConnection) urlToCall.openConnection();
    conn.setRequestProperty("cookie", "JSESSIONID=" + idSession);
    conn.setDoOutput(false);
    conn.setReadTimeout(200*1000);
    conn.setConnectTimeout(200*1000);
    ByteArrayOutputStream output = new ByteArrayOutputStream();
    InputStream openStream = conn.getInputStream();
    byte[] buffer = new byte[1024];
    int size = 0;
    while ((size = openStream.read(buffer)) != -1) {
        output.write(buffer, 0, size);
    }
    html = output.toString("utf-8");
    return html;
}
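
One thing that might be worth trying, instead of hard-coding "utf-8", is to decode the bytes with whatever charset the server declares in its Content-Type response header. The sketch below is only an illustration under that assumption: the class name `HtmlFetcher` and the helper `charsetFromContentType` are hypothetical names, and UTF-8 is an assumed fallback for responses that declare no charset. It does not look at a `<meta charset>` tag inside the page, and it won't help if the "??" actually appears later, for example when the String is printed to a console that uses a different encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HtmlFetcher {

    // Hypothetical helper: pulls the charset parameter out of a Content-Type
    // value such as "text/html; charset=ISO-8859-1". Returns the fallback
    // when the header is missing, has no charset, or names an unknown one.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    static String getHtml(String url, String idSession) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Cookie", "JSESSIONID=" + idSession);
        conn.setReadTimeout(200 * 1000);
        conn.setConnectTimeout(200 * 1000);

        // Read the raw bytes first; decoding happens once, at the end.
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[1024];
            int size;
            while ((size = in.read(buffer)) != -1) {
                output.write(buffer, 0, size);
            }
        }

        // Assumption: UTF-8 is a reasonable default when no charset is declared.
        Charset charset = charsetFromContentType(conn.getContentType(), StandardCharsets.UTF_8);
        return new String(output.toByteArray(), charset);
    }
}
```

With this, a page served as `text/html; charset=ISO-8859-1` would be decoded as ISO-8859-1 rather than UTF-8, which is one common reason accented letters get mangled.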