
I am scraping a website with HtmlUnit. The page has two drop-down lists that are linked: selecting an option in the first list triggers a JavaScript call through its onchange handler, and the options in the second list should then be populated based on that selection.

Suppose I select the "4-Medak" option in the first select tag (id: ddldistlist). It should then populate the corresponding option values in the second select tag (id: ddlaclist) through a JS call. But this isn't happening when I do it through HtmlUnit. Here's the HTML and the code that I have written:

HTML code:

 <select name="ddldistlist" onchange="javascript:setTimeout('__doPostBack(\'ddldistlist\',\'\')', 0)" id="ddldistlist" style="font-weight:bold;width:150px;">
    <option value="-Select-">-Select-</option>
    <option value="1">1-Adilabad</option>
    <option value="2">2-Nizamabad</option>
    <option value="3">3-Karimnagar</option>
    <option value="4" selected="selected">4-Medak</option>
    <option value="5">5-Rangareddy</option>
    <option selected="selected" value="6">6-Hyderabad</option>
</select>
<select name="ddlaclist" id="ddlaclist" style="font-weight:bold;width:155px;">
    <option value="-Select-">-Select-</option>
    <option value="1">City 2</option>
    <option value="2">City 1</option>
</select>
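
Java code:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlOption;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSelect;

// Class name is a placeholder; only the method was shown originally.
public class Scraper {

    public void scrapeWebsite() {
        WebClient webClient = new WebClient(BrowserVersion.CHROME);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setCssEnabled(false);
        webClient.getOptions().setRedirectEnabled(true);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getCookieManager().setCookiesEnabled(true);
        webClient.getOptions().setTimeout(90000);

        // Placeholder URL; getPage() needs a scheme such as http://
        String url = "http://www.xyz.com";

        try {
            HtmlPage page = webClient.getPage(url);

            // drop down list 1: select the district
            HtmlSelect districtNameSelect = (HtmlSelect) page.getElementById("ddldistlist");
            HtmlOption districtNameOption = districtNameSelect.getOptionByValue("6");
            districtNameSelect.setSelectedAttribute(districtNameOption, true);

            // run the onchange handler by hand and take the page it produces
            String js = districtNameSelect.getOnChangeAttribute();
            page = (HtmlPage) page.executeJavaScript(js).getNewPage();

            webClient.waitForBackgroundJavaScript(5000);

            // drop down list 2: should now hold the cities of the district
            HtmlSelect cityNameSelect = (HtmlSelect) page.getElementById("ddlaclist");
            System.out.println("City name list : " + cityNameSelect.asText());
        } catch (Exception exception) {
            // the original referenced an undefined variable `e` here
            System.out.println("Exception:" + exception.getMessage());
        }
    }
}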
Ahmed Ashour
shubh
  • It's not clear - is this dropdown behaviour in the page you are scraping, or in a page you are building? – Roamer-1888 Nov 09 '15 at 04:22
  • Forgot to mention that, @Roamer-1888. Yes, they're 2 different "select" tags in the webpage I am scraping. Wait, let me update this question. – shubh Nov 09 '15 at 04:53
  • You need to provide the complete case (hopefully [minimal](http://htmlunit.sourceforge.net/submittingJSBugs.html)), so others can reproduce the issue – Ahmed Ashour Nov 09 '15 at 05:15
  • Hi @AhmedAshour, I have now updated my problem statement. I hope that gives a clear understanding. Kindly check now. – shubh Nov 09 '15 at 05:38
  • `htmlSelect.setSelectedAttribute` should automatically trigger the JavaScript, however option with value `6` is already selected, so it is ignored. Try with some other value, otherwise please provide "complete" page with/without URL, for example, what does `__doPostBack` do? – Ahmed Ashour Nov 09 '15 at 06:23
  • I tried that earlier without executing the JS externally, but it didn't work, so I had to execute the JS explicitly this way. Here's the website URL: http://ceoaperms.ap.gov.in/TS_Search/search.aspx – shubh Nov 09 '15 at 07:46
  • Should `e.getMessage()` not be `exception.getMessage()`? – Roamer-1888 Nov 09 '15 at 09:55
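
On the question of what `__doPostBack` does: the page's own definition is not shown here, but in standard ASP.NET WebForms the generated function copies its two arguments into the hidden fields `__EVENTTARGET` and `__EVENTARGUMENT` and then submits the form, so the browser performs a full-page POST instead of an in-place update. The sketch below, in plain Java with no HtmlUnit dependency, builds the form-encoded body such a postback would send. It assumes the page uses the standard WebForms function; `PostbackBodySketch`, `buildBody` and the `__VIEWSTATE` placeholder value are hypothetical and for illustration only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: not part of HtmlUnit or of the page being scraped.
final class PostbackBodySketch {

    // Builds the application/x-www-form-urlencoded body that a standard
    // ASP.NET WebForms __doPostBack(eventTarget, eventArgument) would submit.
    // otherFields stands for the remaining form fields (hidden state fields
    // and the current values of the select elements).
    static String buildBody(String eventTarget, String eventArgument,
                            Map<String, String> otherFields) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("__EVENTTARGET", eventTarget);
        fields.put("__EVENTARGUMENT", eventArgument);
        fields.putAll(otherFields);

        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(encode(field.getKey()) + "=" + encode(field.getValue()));
        }
        return body.toString();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> other = new LinkedHashMap<>();
        // Assumption: the real value must be read from the loaded page.
        other.put("__VIEWSTATE", "PLACEHOLDER");
        other.put("ddldistlist", "4");
        other.put("ddlaclist", "-Select-");
        System.out.println(buildBody("ddldistlist", "", other));
    }
}
```

If the page does work this way, the postback replaces the whole document, so the repopulated `ddlaclist` would be expected in the page that results from the change, not in the `HtmlPage` object that was loaded first.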

0 Answers