I have this, but I was wondering if there is a faster way:

        URL url=new URL(page);
        InputStream is = new BufferedInputStream(url.openConnection().getInputStream());
        BufferedReader in=new BufferedReader(new InputStreamReader(is));
        String tmp="";
        StringBuilder sb=new StringBuilder();
        while((tmp=in.readLine())!=null){
            sb.append(tmp);
        }
Lengoman
  • This code uses your system's default character set... which is fine as long as the page content uses the same character set. – dnault Aug 01 '12 at 21:46

3 Answers

The network is probably the biggest overhead, so there isn't much you can do on the Java side. But using Apache Commons IO's IOUtils at least makes the code much quicker to write:

String page = IOUtils.toString(url.openConnection().getInputStream());

Remember to close the underlying stream.
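As a rough sketch of what closing the stream can look like without any extra library: try-with-resources closes the stream even if reading fails. The class and method names (PageFetcher, readAll, fetch) are made up for illustration, and InputStream.readAllBytes() assumes Java 9 or newer. Passing the charset explicitly also avoids depending on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class PageFetcher {

    // Reads the whole stream, decodes it with the given charset,
    // and closes the stream whether or not reading succeeds.
    static String readAll(InputStream in, Charset charset) throws IOException {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), charset); // Java 9+
        }
    }

    // Opens the URL and delegates to readAll, so the connection's stream is closed too.
    static String fetch(URL url, Charset charset) throws IOException {
        return readAll(url.openConnection().getInputStream(), charset);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a network stream, so the demo runs offline.
        InputStream fake = new ByteArrayInputStream(
                "<html>ok</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(fake, StandardCharsets.UTF_8));
    }
}
```

With a real page this would be called as PageFetcher.fetch(new URL(page), StandardCharsets.UTF_8), assuming the page is actually UTF-8.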

Tomasz Nurkiewicz
  • That is right; the IOUtils method does the same thing in one line, except that it uses StringBuffer instead of StringBuilder. – Jan Hruby Aug 01 '12 at 21:46
  • +1, this is simple and probably fast enough. If not, I suppose you could read the Content-Length header and pre-allocate a byte buffer exactly the same size as the content, then pass the byte array to a String constructor... but that seems like overkill. – dnault Aug 01 '12 at 21:51
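For the curious, the Content-Length idea from the comment above might look roughly like this. It is only a sketch: PresizedReader and read are hypothetical names, the length is assumed to come from URLConnection.getContentLength() (which reports -1 when the header is missing), and the fallback uses InputStream.readAllBytes(), which needs Java 9 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper class, not part of any library.
final class PresizedReader {

    // contentLength: the Content-Length value, or -1 if the server did not send one.
    static String read(InputStream in, int contentLength, Charset charset) throws IOException {
        if (contentLength < 0) {
            // Length unknown: let the JDK grow its own buffer.
            return new String(in.readAllBytes(), charset);
        }
        byte[] buf = new byte[contentLength]; // allocated once, exactly the advertised size
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break; // stream ended before the advertised length
            }
            off += n;
        }
        // Decode only the bytes that were actually read.
        return new String(buf, 0, off, charset);
    }
}
```

Whether this beats the one-liner is something to measure; as both the answer and the comment suggest, the network will usually dominate.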

If you need to manipulate the HTML, use a library such as jsoup.

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.

Example:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("#mp-itn b a");
lukastymo

If you're using Apache Commons IO's IOUtils as Tomasz suggests, there's an even simpler method: toString(URL), or its preferred cousins that take a charset (of course that requires knowing the resource's charset in advance).

String string = IOUtils.toString( new URL( "http://some.url" ));

or

String string = IOUtils.toString( new URL( "http://some.url" ), "US-ASCII" );
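If the charset isn't known in advance, one option (an assumption on my part, not something IOUtils does for you) is to look at the Content-Type header, which URLConnection.getContentType() returns. A minimal sketch, with CharsetSniffer and fromContentType as made-up names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class CharsetSniffer {

    // Extracts the charset parameter from a Content-Type value such as
    // "text/html; charset=ISO-8859-1"; returns the fallback if it is absent or unusable.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String key = "charset=";
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive prefix check for "charset=".
            if (p.regionMatches(true, 0, key, 0, key.length())) {
                String name = p.substring(key.length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
    }
}
```

The result can then be passed to the charset-taking toString variants. Note that pages sometimes declare their charset only in a meta tag, which this sketch does not look at.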
Ben Schreiber