7

I am working on a project in which I need to scrape information from several websites. I am using HtmlUnit for this purpose, but I am unable to traverse the elements on one of the pages.

Example:

  <div id="some_id">

      <div>

        <div>

           <div>

              ......
                       many divs in between
              ......

               <div id="my_target_div"> some information </div>

                ........

                ........

                 </div>

How do I get the div with id my_target_div and the information inside that div?

Kishan_KP

2 Answers

5

Use HtmlPage.getHtmlElementById.

Check the documentation.

An example:

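import org.junit.Test;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

@Test
public void getElements() throws Exception {
    final WebClient webClient = new WebClient();

    // "http://some_url" is a placeholder for the page being scraped.
    final HtmlPage page = webClient.getPage("http://some_url");
    // Looks the element up by id, however deeply it is nested.
    final HtmlDivision div = page.getHtmlElementById("my_target_div");
    // The text inside the div.
    final String information = div.getTextContent();

    webClient.closeAllWindows();
}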

Source.

achudars
Thanks for the answer, but I have already solved it myself. Anyway, I am upvoting you for taking the time to answer this question, which may help others who have a similar problem. – Kishan_KP Aug 07 '13 at 12:36
2
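WebClient webClient = new WebClient();
// "http://some_url" is a placeholder for the page being scraped.
HtmlPage page = webClient.getPage("http://some_url");
HtmlElement div = (HtmlElement) page.getFirstByXPath("//div[@id='my_target_div']");
String information = div.getTextContent();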

This will solve your problem.
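
To see what the XPath expression above selects without fetching a real page, here is a minimal sketch that runs the same expression against a small well-formed copy of the markup from the question. Note that it uses the JDK's built-in javax.xml.xpath API instead of HtmlUnit, so it only illustrates the expression, not HtmlUnit itself. The class name XPathDivLookup and the helper textOfDivWithId are hypothetical names made up for this example, and the sample markup is an assumed, closed-off version of the elided snippet in the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XPathDivLookup {

    // Hypothetical helper: returns the trimmed text of the first div
    // with the given id, or null when no such div exists.
    static String textOfDivWithId(String xhtml, String id) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//" matches at any depth, so the nesting level of the target does not matter.
        String expression = "//div[@id='" + id + "']";
        Node node = (Node) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xhtml)),
                XPathConstants.NODE);
        return node == null ? null : node.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input: a well-formed stand-in for the nested divs in the question.
        String xhtml = "<div id=\"some_id\"><div><div><div>"
                + "<div id=\"my_target_div\"> some information </div>"
                + "</div></div></div></div>";
        System.out.println(textOfDivWithId(xhtml, "my_target_div"));
    }
}
```

The same expression string can be passed to getFirstByXPath in HtmlUnit; the main difference is that HtmlUnit parses real-world HTML, while the JDK parser used here requires well-formed XML.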

Hemin