Hey, I am having a little trouble here. I am doing file writing at school and we got the challenge of reading a webpage. How is it possible to do it? I had a go with Jsoup and an Apache plugin, but neither worked, and in any case I have to use the java.net import.

I am a bit of a noob at coding, so there will probably be a couple of errors! Here is my code:

    URL oracle = new URL("http://www.oracle.com/");
    BufferedReader br = new BufferedReader(new InputStreamReader(oracle.openStream()));

    String inputLine;
    while ((inputLine = br.readLine()) != null) {
        System.out.println(inputLine);
    }
    br.close();

The program produces no output. Earlier I did manage to get output, but it was in the form of raw HTML; ironically, I deleted that code while looking for a fix for that issue.

Any help or solutions would be greatly appreciated! Thank you all very much!

Samuelf80

1 Answer

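The code example is from the Java tutorial "Reading Directly from a URL", but the tutorial is old. The URL http://www.oracle.com/ now redirects to https://www.oracle.com/, and your code does not follow that redirect: `openStream()` will not follow a redirect that switches from HTTP to HTTPS, so all you get back is the redirect response, which carries little or no page content to print.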

If you use a URL that does not redirect, like http://www.google.com, you will see that the code works.

If you want a more robust program that handles redirects, you'll probably want to use an HttpURLConnection instead of calling `openStream()` on the URL directly, as it lets you inspect the response code and headers.
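
For illustration, here is a minimal sketch of what that could look like, using only `java.net` and `java.io`. The class name `RedirectReader`, the helper names, and the limit of 5 redirects are my own choices, and I am assuming the page is UTF-8; treat it as a starting point rather than the definitive way to do it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RedirectReader {
    // Arbitrary cap so a redirect loop cannot run forever (an assumption, not a standard value).
    private static final int MAX_REDIRECTS = 5;

    /** True for the HTTP status codes that normally carry a Location header to follow. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Downloads the page at the given address, following redirects by hand. */
    static String fetch(String address) throws IOException {
        URL url = new URL(address);
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // Turn off automatic redirects so every hop, including http -> https, is handled here.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (isRedirect(status)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without a Location header from " + url);
                }
                // Resolve a relative Location value against the current URL.
                url = new URL(url, location);
                continue;
            }
            StringBuilder page = new StringBuilder();
            // Assumes UTF-8; a fuller version would read the charset from the Content-Type header.
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String inputLine;
                while ((inputLine = br.readLine()) != null) {
                    page.append(inputLine).append('\n');
                }
            }
            return page.toString();
        }
        throw new IOException("Too many redirects, last URL: " + url);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java RedirectReader <url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

Running it as `java RedirectReader http://www.oracle.com/` should follow the hop to the HTTPS site and print the HTML of the page it ends up on.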

Matt
  • Caveat: the URL for Google (`http://www.google.com`) might redirect to the localized version of the site if it's accessed from outside the US (e.g. from England, it may redirect you to `https://www.google.co.uk`). – blurfus Jan 10 '17 at 23:02
  • Good point. There's a fair chance that a lot of modern websites will use redirects in one way or another. Your best bet is to use a URL for which you know the HTTP response. – Matt Jan 10 '17 at 23:03
  • It works! But now I can't seem to print it: it is all formatted as HTML when printed, both as a StringBuilder and as a String. I tried the StringBuilder with and without the HTML tags, but to no avail. – Samuelf80 Jan 11 '17 at 15:04
  • When you download a web page, you will receive the raw HTML. Are you trying to print it to paper? That won't work. You need a program that will render the HTML (like a web browser) to print it as it looks on the web. – Matt Jan 14 '17 at 22:01
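
If the goal is only to see the readable text rather than a rendered page, one rough option is to strip the tags out of the downloaded string. This is a sketch under loud assumptions: `HtmlToText` and `stripTags` are names I made up, the regular expressions are crude, and entities such as `&amp;` are left untouched. A real HTML parser would do this properly, but this version sticks to the standard library.

```java
public class HtmlToText {
    /** Crude tag stripper: fine for a quick look at the text, not a real HTML parser. */
    static String stripTags(String html) {
        return html
                // Drop script and style blocks together with their contents.
                .replaceAll("(?is)<(script|style)\\b.*?</\\1\\s*>", " ")
                // Drop every remaining tag.
                .replaceAll("(?s)<[^>]*>", " ")
                // Collapse the runs of whitespace left behind by the removed tags.
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        System.out.println(stripTags(html)); // prints "Title Hello world"
    }
}
```

You would pass it the string you built up while reading the page, for example `stripTags(sb.toString())` if `sb` is your StringBuilder.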