I don't understand how to access an HTML table from a document.

I am playing with this link: Scotia Bank Jobs

The idea is to click the "Next page" button several times and gather all the small HTML tables into one.

When I open the link with WWW::Mechanize::Firefox, I can get the whole document (including the first page's HTML table) with:

my $cont = $mech->content( format => 'html' );

After that, I click the "Next page" button with:

my $id = "search_result_next_page_link";
$mech->click({ xpath => qq{//*[\@id="$id"]}, synchronize => 0 });

I can click the button many times and the table changes inside the document, but I cannot use $mech->content any more, because the URL stays the same and the returned content does not change.

I tried something like:

my $tt= $mech->xpath('/html/body/form/div[4]/div/main/div/div[3]/section/div/div/table/text()');
print $tt;

but it prints "0".

I have a feeling that I am very close. Any idea how to get the HTML table after every click?

i alarmed alien
Andrey
  • You could directly make the requests that clicking on the button is replicating -- if you look at them in the Inspector panel of your browser, they are simple `GET` requests with an incrementing page number, and the data returned is an HTML table. – i alarmed alien Oct 26 '14 at 22:52
  • @ialarmedalien I am looking at the button in inspector and all I see is `Javascript:Paging($url, '2', 'True', 'False')`, where $url is the same URL of the page. '2' is the next page number here. But where to put it in URL for GET request? Can you, please, be more specific? – Andrey Oct 27 '14 at 12:02
  • Reading the documentation, I can see that it's easy to get PNG screenshots of the page, but how to get the table's HTML? – Andrey Oct 28 '14 at 17:18

1 Answer

Finally, I had to bother the author of the WWW::Mechanize::Firefox module, and he provided a way to get the HTML code of this table. The script should look something like this:

my @tt = $mech->selector('.tableSearchResults');   # nodes matching the CSS class
my $HTMLtable = $tt[0]->{innerHTML};               # HTML inside the first match

Don't forget that you need to wait after every click (or write a loop that waits for the element to show up).
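
The waiting loop mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: `wait_for` and `next_table_html` are hypothetical helper names (not part of WWW::Mechanize::Firefox), the 10-second timeout and 0.5-second polling interval are arbitrary, and comparing the new innerHTML with the previous page's is just one way to notice that the table was replaced, since the old table may still be in the document right after the click.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper: call $check repeatedly until it returns a defined
# value or $timeout seconds have passed. Returns that value, or undef
# on timeout.
sub wait_for {
    my ($check, $timeout, $interval) = @_;
    $timeout  //= 10;    # assumed default, in seconds
    $interval //= 0.5;   # assumed polling interval, in seconds
    my $deadline = time() + $timeout;
    while (1) {
        my $value = $check->();
        return $value if defined $value;
        last if time() >= $deadline;
        sleep $interval;
    }
    return;
}

# Hypothetical helper: click "Next page", then wait until the table's
# HTML differs from the previous page's HTML and return the new HTML.
sub next_table_html {
    my ($mech, $previous_html) = @_;
    my $id = "search_result_next_page_link";
    $mech->click({ xpath => qq{//*[\@id="$id"]}, synchronize => 0 });
    return wait_for(sub {
        my @tt = $mech->selector('.tableSearchResults');
        return unless @tt;                      # table not in the DOM yet
        my $html = $tt[0]->{innerHTML};
        return unless defined $html;
        return if defined $previous_html && $html eq $previous_html;
        return $html;                           # table content has changed
    });
}
```

With a `$mech` object already showing the first page, one might read the first table as in the answer, then call `next_table_html($mech, $HTMLtable)` in a loop and append each result until it returns undef, which here means no new table appeared before the timeout.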

Andrey