1

I am using Java to get some information from a web page. The problem is that the information I need is generated by a JavaScript function. How can I get this information? The code below only returns the page as it is before it has fully loaded, which means I get only the skeleton of the page.

code1.

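import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// pageUrl stands for the address of the page; the original post left it out.
URL target = new URL(pageUrl);
HttpURLConnection con = (HttpURLConnection) target.openConnection();
StringBuffer sb = new StringBuffer();
String line = "";
BufferedReader br = null;

try {
    br = new BufferedReader(new InputStreamReader(con.getInputStream()));

    // Read the raw response body line by line; no JavaScript is executed here.
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
} catch (Exception e) {
    e.printStackTrace();
}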

Is there a way to know in Java when the page has fully loaded? (An extra library can be an answer, but I would prefer to do it in plain Java.) Thanks.

saintedlama
Juneyoung Oh
  • This code will only fetch a single resource that is identified by the URL, and not any referenced resources inside that resource. For that you really need something like a web browser – Jason Sperske Jan 27 '14 at 07:33
  • You could possibly wire something like this with http://htmlunit.sourceforge.net/ – Jason Sperske Jan 27 '14 at 07:36

2 Answers

3

You are making an HTTP request from Java, and that returns a text stream. "Page loaded" is a browser concept: the browser requests the content of the page (the same thing you are doing), then renders the page and executes the JavaScript. It is the browser that executes JavaScript, not the HTTP request.
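To make the point concrete, here is a minimal sketch using only the JDK. Everything in it is made up for illustration: the class name RawFetchDemo, the tiny test page, and the local server (the JDK's built-in com.sun.net.httpserver) that stands in for the real site. The page's script would fill a div in a browser, but the fetched text contains only the script source, never its result.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class RawFetchDemo {
    // Hypothetical page: the text "generated-by-script" exists only after
    // a browser runs the script and concatenates the two string literals.
    static final String PAGE =
        "<html><body><div id=\"out\"></div>"
        + "<script>document.getElementById('out').textContent = "
        + "'generated' + '-by-script';</script>"
        + "</body></html>";

    // Same approach as the question's code: read the response body as text.
    static String fetch(URL target) throws IOException {
        HttpURLConnection con = (HttpURLConnection) target.openConnection();
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } finally {
            con.disconnect();
        }
        return sb.toString();
    }

    // Serves PAGE from a throwaway local server on a free port and fetches it.
    static String fetchFromLocalServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = PAGE.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return fetch(URI.create("http://127.0.0.1:" + port + "/").toURL());
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        String html = fetchFromLocalServer();
        System.out.println(html.contains("<script>"));            // script source arrives as text
        System.out.println(html.contains("generated-by-script")); // its result never appears
    }
}
```

Running main should print true and then false: the response carries the script as plain text, and nothing on the Java side ever evaluates it.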

If you want to do this in Java only, you need a headless browser (a browser without a user interface), or at least a way to take the JavaScript from the page you are loading and execute it. Doing this from scratch in pure Java is not an easy task; check out HtmlUnit for an example.

Oak
AlfredoCasado
1

Java won't execute any client-side JavaScript. It will just read it. If you want a browser, use a browser.

user207421
  • Strictly speaking, the OP could leverage Rhino (by no means a substitute for the full DOM that a browser offers natively). It would help to narrow the ambition of this app down to just the JavaScript function they want to extract, execute and read from. – Jason Sperske Jan 27 '14 at 07:39
  • Thanks. I thought it was possible. – Juneyoung Oh Jan 27 '14 at 07:58