I am working on a crawler, and one piece of data I want to store about each site I crawl is its IP address. I'd prefer to do this without having to hit the server again, so is there any way to get this information from LWP or WWW::Mechanize after the page has already been requested? For instance:

my $mech = WWW::Mechanize->new();
$mech->get($url);
my $ip = $mech->url_ip;   # hypothetical method, just to show what I'm after

I've looked through the documentation for LWP and WWW::Mechanize and can't find anything, but I've missed things before. Does anyone know of a way to do this with one of these modules, or with a similar module? Thanks for the help!

srchulo

2 Answers


If all you want to store is the host's A and AAAA (quad-A) records, you could also try something like this:

use strictures;
use Perl6::Take qw(gather take);
use Socket 1.96 qw(getaddrinfo getnameinfo AF_INET6 AF_INET SOCK_STREAM NI_NUMERICHOST NIx_NOSERV);
# require 1.96 or better for NIx_NOSERV, ships with Perl 5.14
⋮
my $host = $mech->uri->host;   # uri() returns the current page's URI object
my @ip = gather {
    for my $family (AF_INET6, AF_INET) {
        my ($err, @addrinfo) = getaddrinfo($host, 'http', { family => $family, socktype => SOCK_STREAM });
        warn "Cannot getaddrinfo - $err" if $err;
        for my $ai (@addrinfo) {
            my ($err, $ipaddr) = getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
            if ($err) {
                warn "Cannot getnameinfo - $err";
                next;    # skip addresses that could not be converted
            }
            take $ipaddr;
        }
    }
};
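
For anyone who would rather avoid the two CPAN dependencies (strictures and Perl6::Take), the same lookup can be sketched with core modules only. This is a minimal sketch, not a drop-in replacement: the helper name resolve_ips is made up for illustration, and the service argument is passed as undef instead of 'http' on the assumption that only the addresses matter. Like the code above, it performs a fresh DNS lookup, so the result is not guaranteed to be the exact address the earlier request connected to.

```perl
use strict;
use warnings;
use Socket 1.96 qw(getaddrinfo getnameinfo AF_INET6 AF_INET SOCK_STREAM
                   NI_NUMERICHOST NIx_NOSERV);

# resolve_ips is a hypothetical helper name; it is not part of LWP,
# WWW::Mechanize, or Socket.
sub resolve_ips {
    my ($host) = @_;
    my @ips;
    for my $family (AF_INET6, AF_INET) {
        # Service is undef: we only want addresses, not a port lookup.
        my ($err, @addrinfo) = getaddrinfo($host, undef,
            { family => $family, socktype => SOCK_STREAM });
        if ($err) {
            warn "Cannot getaddrinfo for $host - $err";
            next;    # e.g. the host has no address in this family
        }
        for my $ai (@addrinfo) {
            # NI_NUMERICHOST returns the address as a numeric string.
            my ($nerr, $ipaddr) =
                getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
            if ($nerr) {
                warn "Cannot getnameinfo - $nerr";
                next;
            }
            push @ips, $ipaddr;
        }
    }
    return @ips;
}

# Usage: perl resolve.pl example.com
print "$_\n" for map { resolve_ips($_) } @ARGV;
```

In the crawler you would call it with the host of the page you just fetched, e.g. resolve_ips($host). Expect a warning for any address family the host has no records in.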
LeoNerd
Sebastian Stumpf

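You can use Net::DNS. Here's a simple example:

use Net::DNS;

my $resolver = Net::DNS::Resolver->new();
# send() returns undef on failure; errorstring explains why
my $response = $resolver->send("example.com", "A")
    or die "DNS query failed: " . $resolver->errorstring;
my @rr = grep { $_->type eq "A" } $response->answer;
my $ip = @rr ? $rr[0]->address : undef;   # undef if there is no A record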