I have a list of 2500 websites and need to grab a thumbnail screenshot of each of them. How do I do that?

I could try to fetch the sites with either Perl or Python; Mechanize would be a good fit, but I am not very experienced with Perl.

  • You could sign up at snap.com and then use Perl to grab the snapshot images from them - check their Terms of Service first though – Grant McLean Dec 06 '11 at 00:33
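
If you go the hosted-service route suggested in the comment, grabbing each pre-rendered image is a single HTTP GET. Here is a minimal sketch with LWP::Simple; the endpoint URL is entirely hypothetical, and a real service would document its own URL scheme:

  use strict;
  use warnings;
  use LWP::Simple qw(getstore is_success);

  # Hypothetical snapshot endpoint; substitute the real service's URL scheme
  my $status = getstore(
      'http://snapshot.example.com/thumb?url=http://google.com',
      'google_thumb.png',
  );
  die "Download failed: $status" unless is_success($status);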

1 Answer


Here is a Perl solution:

  use strict;
  use warnings;
  use WWW::Mechanize::Firefox;

  # Drives a running Firefox instance (requires the MozRepl extension)
  my $mech = WWW::Mechanize::Firefox->new();
  $mech->get('http://google.com');

  # Render the current page as a PNG image (returned as raw bytes)
  my $png = $mech->content_as_png();

From the docs:

Returns the given tab or the current page rendered as PNG image.

All parameters are optional. $tab defaults to the current tab. If the coordinates are given, that rectangle will be cut out. The coordinates should be a hash with the four usual entries, left,top,width,height.

This is specific to WWW::Mechanize::Firefox.

Currently, the data transfer between Firefox and Perl is done Base64-encoded. It would be beneficial to find what's necessary to make JSON handle binary data more gracefully.
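
Since the question asks for thumbnails, the raw PNG can be scaled down before it is written to disk. Here is a minimal sketch using the Imager module; the module choice and the 200px width are assumptions, and any image library would do:

  use strict;
  use warnings;
  use WWW::Mechanize::Firefox;
  use Imager;  # assumed to be installed; any image library works

  my $mech = WWW::Mechanize::Firefox->new();
  $mech->get('http://google.com');

  # Load the rendered PNG and scale it to a 200px-wide thumbnail,
  # preserving the aspect ratio
  my $img = Imager->new;
  $img->read(data => $mech->content_as_png(), type => 'png')
      or die $img->errstr;
  my $thumb = $img->scale(xpixels => 200);
  $thumb->write(file => 'google_thumb.png', type => 'png')
      or die $thumb->errstr;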

  • Well, how do I add the different URLs? What if I read them in from a file? In other words, I store the URLs in a file, and afterwards I put the results in another directory. What do you think? – zero Dec 06 '11 at 20:04
  • Well, you can quickly find this info in the Perl documentation: `perldoc -f open` – gangabass Dec 07 '11 at 03:44
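
Putting together what the comments above ask for, here is a minimal sketch of the batch run: it reads one URL per line and writes one PNG per site into an output directory. The file name urls.txt, the thumbs/ directory, and the numbered output names are all assumptions:

  use strict;
  use warnings;
  use WWW::Mechanize::Firefox;

  my $mech = WWW::Mechanize::Firefox->new();

  # One URL per line, e.g. the 2500 sites from the question
  open my $urls, '<', 'urls.txt' or die "Can't open urls.txt: $!";
  mkdir 'thumbs' unless -d 'thumbs';

  my $n = 0;
  while (my $url = <$urls>) {
      chomp $url;
      next unless $url;
      # One bad site should not abort the whole batch
      eval {
          $mech->get($url);
          open my $out, '>', sprintf('thumbs/%04d.png', ++$n)
              or die "Can't write: $!";
          binmode $out;                     # PNG data is binary
          print $out $mech->content_as_png();
          close $out;
      };
      warn "Failed on $url: $@" if $@;
  }
  close $urls;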