
I am trying to parse http://finance.yahoo.com/quote/VZ/key-statistics?p=VZ with the code below, but the page does not load completely. I have tried webClient.waitForBackgroundJavaScript(500000); and Thread.sleep(1000); without any success. My objective is to read the table contents under Valuation Measures, but that table never gets loaded. Any help is appreciated.

import java.sql.Timestamp;
import java.util.ArrayList;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class LocalScreenScappingTest {

    public static void main(String[] args) {

        try {

            java.util.logging.Logger.getLogger("com.gargoylesoftware")
                    .setLevel(java.util.logging.Level.OFF);
            WebClient webClient = new WebClient(BrowserVersion.CHROME);
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.getOptions().setCssEnabled(true);
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setPrintContentOnFailingStatusCode(false);
            // webClient.waitForBackgroundJavaScript(500000);
            HtmlPage page = webClient
                    .getPage("http://finance.yahoo.com/quote/VZ/key-statistics?p=VZ");
            // Thread.sleep(1000);
            System.out.println(page.asText());
            // HtmlTable table = (HtmlTable)
            // page.getFirstByXPath("//*[@id='main-0-Quote-Proxy']/section/div[2]/section/div/section/div[2]/div[1]/div[1]/div/table");

        } catch (Exception ex) {

            System.out.println((new Timestamp(new java.util.Date().getTime()))
                    + ": " + ex.toString());
        }

    }

}
xes_p
  • You may want to consider using some parsing libraries like Jsoup: https://jsoup.org/ or get data directly from Yahoo Finance API: http://meumobi.github.io/stocks%20apis/2016/03/13/get-realtime-stock-quotes-yahoo-finance-api.html – Defozo Sep 03 '16 at 23:17
  • I tried Jsoup and it returns the same result. I will try the Yahoo Finance API. – Lostsomewhere Sep 03 '16 at 23:34
  • Attach logcat logs. – Defozo Sep 03 '16 at 23:37

1 Answer


If you inspect the page through the browser's developer tools, you'll see that the responses from 'finance.yahoo.com' are mostly JSON. The HTML file of the webpage doesn't contain the table you want. You'll have to determine which response contains the table and use a JSON parser to read it.
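As a rough illustration of that last step, here is a minimal sketch of pulling the displayed values out of a JSON body once you have it as a string. The payload, its field names (valuationMeasures, raw, fmt) and the numbers are made up for the example and are not Yahoo's actual response format. To keep it dependency-free it uses java.util.regex as a stand-in; for a real response a proper JSON parser such as Gson or Jackson would be the safer choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValuationJsonSketch {

    // Hypothetical payload: field names and numbers are invented for
    // illustration and are NOT Yahoo's actual response format.
    public static final String SAMPLE_JSON =
            "{\"valuationMeasures\":{"
            + "\"marketCap\":{\"raw\":1000,\"fmt\":\"1k\"},"
            + "\"trailingPE\":{\"raw\":12.5,\"fmt\":\"12.50\"}}}";

    // Matches entries shaped like "name":{"raw":<number>,"fmt":"<text>"}
    // and captures the name (group 1) and the formatted text (group 2).
    private static final Pattern ENTRY = Pattern.compile(
            "\"(\\w+)\":\\{\"raw\":[-0-9.eE+]+,\"fmt\":\"([^\"]*)\"\\}");

    // Returns measure name -> formatted value, in order of appearance.
    public static Map<String, String> extractFormatted(String json) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(json);
        while (m.find()) {
            values.put(m.group(1), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        extractFormatted(SAMPLE_JSON)
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The regex only works because the sample has a fixed, compact layout. A real response with whitespace, nesting or reordered keys would break it, which is why a JSON parser is the better tool once you know which response holds the table.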

Alexiy