I'm using the following Perl code to get data from https://www.otcmarkets.com/research/stock-screener/api?sortField=symbol&sortOrder=asc&page=0&pageSize=20000:

use warnings;
use WWW::Mechanize::GZip;

my $TempFilename = "D:\\temp\\test.txt";

my $mech = WWW::Mechanize::GZip->new(
    ssl_opts => {
        verify_hostname => 0,
    },
);

$mech->get("https://www.otcmarkets.com/research/stock-screener/api?sortField=symbol&sortOrder=asc&page=0&pageSize=20000");
open(OUT, ">", $TempFilename);
binmode(OUT, ":utf8");
print OUT $mech->content;
close(OUT);

Unfortunately the request always times out, and my temporary file always contains

read timeout at C:/Strawberry/perl/vendor/lib/Net/HTTP/Methods.pm line 268.

However, if I point a web browser to the same URL, I get a bunch of JSON data that looks like this, which is what I am seeking:

"{\"count\":17114,\"pages\":1,\"stocks\":[{\"securityId\":194057,\"reportDate\":\"Jan 26, 2022 12:00:00 AM\",\"symbol\":\"AAAIF\",\"securityName\":\"ALTERNATIVE INVESTMENT TR\",\"market\":\"Pink ...

My question is whether there is any way I can modify my script so that it saves the same data that my web browser is able to display instead of the timeout message to my file.

Thanks

  • You can probably use `:content_cb`. See the LWP::UserAgent docs (of which WWW::Mechanize is a subclass). – ikegami Jan 27 '22 at 22:12
  • Is there any particular reason you chose [WWW::Mechanize::GZip](https://metacpan.org/pod/WWW::Mechanize::GZip) to capture the generated JSON? – Polar Bear Jan 28 '22 at 01:11

1 Answer

Change the user agent. The default is a string of the form libwww-perl/#.###, and some sites are sensitive to that. You can also use WWW::Mechanize directly and set an explicit timeout (in seconds), like this:

use strict;
use warnings;
use WWW::Mechanize;

my $TempFilename = "c:\\temp\\test.txt";
my $url = "https://www.otcmarkets.com/research/stock-screener/api?sortField=symbol&sortOrder=asc&page=0&pageSize=20000";

my $mech = WWW::Mechanize->new(
    agent    => "Mozilla/5.0",
    timeout  => 15,
    # ssl_opts => { verify_hostname => 0 },
);

$mech->get($url);
open my $f_out, ">", $TempFilename or die "Cannot open $TempFilename: $!";
binmode $f_out, ":utf8";
print $f_out $mech->content;
close $f_out;
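
The timeout text probably ended up in the file because LWP reports client-side failures as a synthetic error response whose body is the error message, and the original script printed the content without checking the status. A minimal sketch of guarding against that, assuming you turn off `autocheck` and inspect the response yourself (`save_if_success` is a hypothetical helper name, not part of WWW::Mechanize):

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: write the response body to $path only if the
# request succeeded. Returns 1 if the file was written, 0 otherwise.
sub save_if_success {
    my ($res, $path) = @_;
    return 0 unless $res->is_success;

    open my $fh, ">:raw", $path or die "Cannot open $path: $!";
    # charset => 'none' undoes any Content-Encoding but keeps the body
    # as bytes, so the file holds what the server sent.
    print {$fh} $res->decoded_content(charset => 'none');
    close $fh or die "Cannot close $path: $!";
    return 1;
}

my $TempFilename = "c:\\temp\\test.txt";
my $url = "https://www.otcmarkets.com/research/stock-screener/api?sortField=symbol&sortOrder=asc&page=0&pageSize=20000";

my $mech = WWW::Mechanize->new(
    agent     => "Mozilla/5.0",
    timeout   => 15,
    autocheck => 0,    # do not die on errors; check the response instead
);

my $res = $mech->get($url);
save_if_success($res, $TempFilename)
    or warn "Request failed: ", $res->status_line, "\n";
```

With this, a timeout shows up as a warning on STDERR and the output file is left untouched instead of being overwritten with the error text.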
Miguel Prz