I have a problem parsing data from Freebase with a crawler I'm writing in Perl. I'm trying to pull data from this URL:
(example)
It is a page of IMDB IDs and MIDs, and I'm trying to extract the links. The problem is that I only get 100 results; when I scroll to the bottom of the page in Mozilla Firefox, 11 more results load. I'm using LWP::UserAgent.
Does anybody know a solution, with some sample code, for automatically pulling all 111 MID links from this page?
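The likely cause: the last 11 results are loaded by JavaScript when you scroll, and LWP::UserAgent only fetches the initial HTML, so it never sees them. One way around this is to find the follow-up request the page makes (in the browser's network inspector) and page through it yourself with a cursor/offset loop. Below is a minimal sketch of that looping pattern only; `fetch_page` and its cursor field are stand-ins for whatever request and response format the real endpoint uses, which you would have to discover yourself:

```perl
#!/usr/bin/perl
# Sketch of cursor-style paging. fetch_page is a STAND-IN for whatever
# request the Freebase page issues on scroll; the real URL, parameters,
# and response shape must be taken from the browser's network inspector.
use strict;
use warnings;

# Fake two-page response data, purely for illustration.
my @fake_pages = (
    { links => [ 'm/0a', 'm/0b' ], next => 1 },
    { links => [ 'm/0c' ],         next => undef },  # last page
);

sub fetch_page {
    my ($cursor) = @_;
    return $fake_pages[ $cursor // 0 ];
}

my (@all, $cursor);
while (1) {
    my $page = fetch_page($cursor);
    push @all, @{ $page->{links} };
    last unless defined $page->{next};  # no cursor => no more pages
    $cursor = $page->{next};
}
print scalar(@all), " links\n";   # 3 links
```

The same loop works whether the real endpoint paginates with a cursor token, an offset, or a page number; only `fetch_page` and the termination test change.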
Here is my code:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

my $URL = 'http://www.freebase.com/authority/imdb/title?ns&lang=en&filter=%2Ftype%2Fnamespace%2Fkeys&timestamp=2013-11-20&timestamp=2013-11-21';

my $browser = LWP::UserAgent->new();
$browser->timeout(10);

my $response = $browser->get($URL);
die $response->status_line, "\n" if $response->is_error;
my $contents = $response->decoded_content;

# Collect every link on the page, resolved against $URL.
my $page_parser = HTML::LinkExtor->new(undef, $URL);
$page_parser->parse($contents)->eof;

foreach my $link ($page_parser->links) {
    my $url = $$link[2];
    # Keep only topic links of the form http://www.freebase.com/m/...
    if ($url =~ m{^http://www\.freebase\.com/(m/[^?#]+)}) {
        print "MID $1\n";
    }
}
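For reference, the filtering step on its own, without the network fetch: given any chunk of HTML, pull out the `/m/...` identifiers from the `href` attributes. The sample HTML and MIDs below are made up for illustration:

```perl
#!/usr/bin/perl
# Self-contained sketch of the MID-filtering step: extract every
# freebase.com/m/... identifier from href attributes. The sample
# HTML and MID values here are invented for demonstration.
use strict;
use warnings;

sub extract_mids {
    my ($html) = @_;
    my @mids;
    # MIDs look like m/0abc12; stop matching at '?' or the closing quote.
    while ($html =~ m{href="http://www\.freebase\.com/(m/[0-9a-z_]+)}g) {
        push @mids, $1;
    }
    return @mids;
}

my $sample = '<a href="http://www.freebase.com/m/0abc12?links=">A</a>'
           . '<a href="http://www.freebase.com/m/0xyz99">B</a>';
my @mids = extract_mids($sample);
print "MID $_\n" for @mids;   # MID m/0abc12, MID m/0xyz99
```

Running this after the page fetch in the script above would give the same output as the HTML::LinkExtor loop, but a real HTML parser is still the safer choice for full pages.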