I am trying to scrape a website using the WWW::Mechanize module. I have configured the Mechanize agent with a proxy URL and set the proxy credentials using the credentials method.

Code snippet:

use strict;
use warnings;
use WWW::Mechanize;

my $url = 'https://abcde.com';
my $proxy_username = 'abc';
my $proxy_password = 'xyz';
my $proxy_url = 'http://xx.xxx.xxx.xxx:13228';

# noproxy => 1 stops Mechanize from calling env_proxy() automatically
my $mechanize_agent = WWW::Mechanize->new('cookie_jar' => {}, 'noproxy' => 1, 'ssl_opts' => { 'verify_hostname' => 0 });
$mechanize_agent->credentials( $proxy_username, $proxy_password );
$mechanize_agent->proxy(['http', 'https'], $proxy_url);
# double quotes so that $url and $@ are interpolated into the message
$mechanize_agent->get($url) or die "Error in get request of $url: $@";

When the URL is plain HTTP, the script fetches the page and returns the result. But when I request an HTTPS URL, I get this error:

establishing SSL tunnel failed: 407 Proxy Authentication Required

My credentials are valid, and I can view the website through the proxy URL in a Mozilla browser. I also need to avoid calling the env_proxy() function, since the proxy URL is dynamic. How can I get my script to fetch HTTPS URLs?

All suggestions are welcome! Thanks in advance.

  • 1
    What happens when you try http://p3rl.org/LWP::Protocol::connect#Proxy-Authentication – daxim Jul 06 '18 at 07:15
  • 1
    What version of LWP are you using? Make sure to use a version newer than 6.06, since versions before that have problems with HTTPS over a proxy. – Steffen Ullrich Jul 06 '18 at 08:12
  • @daxim appreciate your prompt reply. Using the proxy authorization in above link I was able to get the content of HTTPS URL. Can you point out why the above code is failing for HTTPS request? – Aniket Golatkar Jul 06 '18 at 09:09
  • @SteffenUllrich Thanks for quick response, I am using LWP's version 6.34 – Aniket Golatkar Jul 06 '18 at 09:11
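
For readers following the comment thread, here is a minimal sketch of the approach the first comment's link points to: embedding the proxy credentials in a connect:// proxy URL so that they are sent with the CONNECT request. It assumes the LWP::Protocol::connect distribution is installed and that it accepts credentials in the userinfo part of the proxy URL, as its documentation describes. The helper names proxy_url_with_auth and fetch_via_proxy are hypothetical, and percent-escaping the credentials is an assumption about how the module reads them.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'http://host:port' plus credentials into
# 'connect://user:pass@host:port'. Credentials are percent-escaped so that
# characters such as '@' or ':' cannot break the URL (assumption: the proxy
# module unescapes the userinfo part before using it).
sub proxy_url_with_auth {
    my ($proxy_url, $user, $pass) = @_;
    my ($host_port) = $proxy_url =~ m{^\w+://([^/]+)}
        or die "Cannot parse proxy URL: $proxy_url";
    s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge for $user, $pass;
    return "connect://${user}:${pass}\@${host_port}";
}

# Hypothetical helper: fetch an HTTPS URL through the authenticated proxy.
# Assumes WWW::Mechanize and LWP::Protocol::connect are installed; LWP looks
# up the protocol handler from the 'connect' scheme of the proxy URL.
sub fetch_via_proxy {
    my ($url, $proxy) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new(cookie_jar => {}, noproxy => 1);
    $mech->proxy('https', $proxy);
    my $response = $mech->get($url);    # autocheck makes get() die on failure
    return $response->decoded_content;
}

# Usage (performs a real network request, so it is left commented out):
# my $proxy = proxy_url_with_auth('http://xx.xxx.xxx.xxx:13228', 'abc', 'xyz');
# print fetch_via_proxy('https://abcde.com', $proxy);
```

Because the proxy URL is built at run time, this keeps the proxy dynamic and does not rely on env_proxy().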
