
I am unable to fetch the rendered HTML of a page whose content is loaded client-side. I tried both HtmlUnit and Jsoup, but neither works: in both cases I only get a blank HTML page.

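// Package names assume HtmlUnit 2.x; HtmlUnit 3.x moved them to org.htmlunit.
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

try (final WebClient webClient = new WebClient()) {
    // Keep going when a page script throws, skip CSS, accept any TLS certificate.
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setCssEnabled(false);
    webClient.getOptions().setUseInsecureSSL(true);

    HtmlPage page = webClient.getPage(storeUrl);
    // Give background JavaScript up to 10 seconds to finish.
    webClient.waitForBackgroundJavaScript(10000);

    // Print the DOM as it stands after the wait.
    System.out.println(page.asXml());
}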

I have attached my code above.

<body>
    <div id="app">
    </div>
    <script type="text/javascript" src="https://appgallery5.huawei.com//static/2021092315/js/manifest.7678f8af2ad1888b12b7.js">
    </script>
    <script type="text/javascript" src="https://appgallery5.huawei.com//static/2021092315/js/vendor.4515fcb67725b83423f2.js">
    </script>
    <script type="text/javascript" src="https://appgallery5.huawei.com//static/2021092315/js/app.dffbd1139496dce7c98e.js">
    </script>
  </body>
</html>

Above is the output I am getting. What am I doing wrong here?

Janez Kuhar
  • jsoup doesn't support AJAX, but HtmlUnit [apparently does](https://htmlunit.sourceforge.io/#JavaScript_Support). – Janez Kuhar Nov 09 '21 at 16:27
  • You might find this useful: https://htmlunit.sourceforge.io/faq.html#AJAXDoesNotWork But from what I can see, you're using the API as intended. Perhaps you could add a sample of the web page you're trying to scrape and the HtmlUnit version you are using. – Janez Kuhar Nov 09 '21 at 16:33
  • Please add the storeUrl to give us a chance to reproduce your case; additionally, please open an issue for HtmlUnit on GitHub. – RBRi Nov 10 '21 at 06:06
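
While the question is waiting for a reproducible URL, one thing that can help narrow it down is replacing the single fixed wait with a loop that re-checks whether anything has been rendered. Below is a minimal sketch of such a polling helper in plain Java. It assumes the content arrives asynchronously at all, which is not confirmed here. `RenderPoller` and `waitUntil` are hypothetical names and not part of HtmlUnit, and the condition in `main` is only a stand-in for a real DOM check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition until it
// holds or a timeout expires.
final class RenderPoller {

    /**
     * Re-checks `rendered` every `intervalMs` milliseconds until it returns
     * true or `timeoutMs` milliseconds have elapsed.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    static boolean waitUntil(BooleanSupplier rendered, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (rendered.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in condition: becomes true on the third check.
        int[] checks = {0};
        boolean ok = waitUntil(() -> ++checks[0] >= 3, 1_000, 10);
        System.out.println("rendered=" + ok + " after " + checks[0] + " checks");
    }
}
```

With HtmlUnit, the condition could plausibly call `webClient.waitForBackgroundJavaScript(...)` for a short interval and then test whether the `#app` element from the output above has gained any child elements. If it never does, the scripts are probably failing rather than just being slow, which would be useful detail for the issue report requested above.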
