I'm using WWW::Mechanize to load the catalog from our product provider into our database. I run this script every 2 hours, every day, and it completes in around 12 minutes using about 50 simultaneous threads.

Everything was working perfectly until this weekend. They took their website offline for scheduled maintenance and, once they were online again, my script no longer worked. After some analysis, it comes down to the following code failing:

use strict;
use warnings;

use WWW::Mechanize;

my $mec = WWW::Mechanize->new;
$mec->get('https://www.imstores.com/Ingrammicromx/login/login.aspx');

print $mec->content;

The code dies (after about 60 seconds) with the following message:

Error GETing https://www.imstores.com/Ingrammicromx/login/login.aspx:
Can't connect to www.imstores.com:443 at test.pl line 7.

Now, these are the points that make it difficult for me to find the problem:

  1. It's not network-related - if I visit the same URL from any of my browsers, I get the page.

  2. If I try the same code on a remote machine that contains an exact copy of my Perl installation, it works.

  3. If I load Net::SSL before WWW::Mechanize, it takes a very LONG time, but it finally gets the page.

  4. If I try any other SSL page, like 'https://www.paypal.com', it works, and very quickly.

  5. Then again, it was working before their scheduled maintenance.

I'm not sure what else to try. If I switch to the non-SSL version, it works, but I don't want to do that since we automate purchasing operations.

Among the many things that have crossed my mind while thinking about why it works on the remote machine, and why I can open the page in my browsers on the local one:

Is it possible to get blocked because of my SSL public key? If so, what public key is LWP/Mechanize using for SSL sessions, and how can I use a different one?

Some data on my current setup:

Thanks in advance for any helpful comment.

Francisco Zarabozo
  • There's no public key for SSL; I think you're confusing it with SSH. Do you have the same problem if you do an SSL connection directly from the command line (`openssl s_client -connect www.imstores.com:443`)? That might give you some information. – Jenny D Apr 16 '13 at 06:52
  • Are you sure you have not been blocked by the web server? – AnFi Apr 16 '13 at 06:59
  • (1) might not be true. Your browser could be using a proxy. – ikegami Apr 16 '13 at 07:01
  • What's the `->status_line` of the response? – ikegami Apr 16 '13 at 07:03
  • It is possible that the website decided to block the `WWW-Mechanize` user agent. Did you try specifying a user agent? For example, `my $mec = WWW::Mechanize->new( agent => 'Mozilla' );` – devnull Apr 16 '13 at 07:26
  • @JennyD: There's a public and private key in SSL. The web server uses your public key to encrypt the data you later decrypt with your private key. The server also has a public key that you use to send requests and later the server uses its private key to read that request. There are cases where a server will only accept an SSL connection from a client using a specific public (client) key. To answer your question, yes, I can connect with the command `openssl s_client -connect www.imstores.com:443`. – Francisco Zarabozo Apr 16 '13 at 07:56
  • @AndrzejA.Filip: No, I'm not sure. What I'm sure of is that I'm not blocked by IP address, because I can get the page from any browser or with an OpenSSL command at a command prompt. Also, as I mentioned, I can get it if I force WWW::Mechanize to use Net::SSL. The only thing that comes to mind is that I'm somehow being blocked by whatever key LWP sends when using IO::Socket::SSL instead, but then again, I don't know how to test that. – Francisco Zarabozo Apr 16 '13 at 07:59
  • @ikegami: I can't get to a ->status_line call. The call dies at ->get. – Francisco Zarabozo Apr 16 '13 at 07:59
  • @devnull: Yes, I'm actually cloning the user-agent from Internet Explorer or Chrome (along with all their common headers). Changing the user-agent makes no difference; the results are the same as described in the OP. – Francisco Zarabozo Apr 16 '13 at 08:01
  • @ikegami: About using a proxy, no, I'm not using any proxy. I'm on a direct-to-the-internet ISP connection, my router gets a public IP address, not a LAN one, and I'm the only one connected to my (very simple) router. – Francisco Zarabozo Apr 16 '13 at 08:03
  • Time to stop wondering and stick a network monitor on your system. It's not going to show you the contents of the packets (damn encryption) but you'll be able to see who's not talking. It does sound like there's something the server doesn't like about your request though. http://www.wireshark.org/download.html – Richard Huxton Apr 16 '13 at 08:38
  • Is this a good idea, to bombard their site with 50 simultaneous connections (each of which takes a long time) every 2 hours? This is the kind of thing people will try to block. –  Apr 16 '13 at 08:47
  • @FranciscoZarabozo Ah, I was misled by terminology - public key is something I associate with SSH, while for SSL I'd use "client certificate". Anyhow, you would have to actually choose to send a client cert, it's not something that happens on its own. – Jenny D Apr 16 '13 at 08:52
  • @dan1111: It's taking a long time for some reason with WWW::Mechanize. When visited in a browser, the response is fast. Also, they have an amazing backend, they serve requests to all branches worldwide. Without me doing anything, they actually handle millions of requests per day, and since they don't provide an API to the catalog, I'm pretty sure we're not the only sellers getting the catalog that way. – Francisco Zarabozo Apr 16 '13 at 09:47
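
On the `->status_line` question raised in the comments above: `get` dies because WWW::Mechanize enables `autocheck` by default. The following is a minimal sketch, not code from the thread, of how the response could be inspected instead. It assumes `autocheck => 0` and an illustrative 60-second `timeout`; the URL is the one from the question.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 0 makes get() return the failed response instead of dying.
# The 60-second timeout is an illustrative choice, not a value from the thread.
my $mec = WWW::Mechanize->new( autocheck => 0, timeout => 60 );
my $res = $mec->get('https://www.imstores.com/Ingrammicromx/login/login.aspx');

if ( $res->is_success ) {
    print "OK: ", $res->status_line, "\n";
}
else {
    print "Failed: ", $res->status_line, "\n";

    # LWP adds a Client-Warning header to responses it generated itself,
    # for example when the connection could not be established.
    my $warning = $res->header('Client-Warning');
    print "Client-Warning: $warning\n" if defined $warning;
}
```

With `autocheck` off, a connection failure shows up as an ordinary response object whose status line carries the error text, which is easier to log than a fatal error.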

1 Answer

Here's the actual reason for the problem: You need to use SSLv3 or TLS1 instead of TLS1.2 to connect to that server. This is probably why it worked when you used Net::SSL first; I believe it tries different ciphers in a way that WWW::Mechanize doesn't.

This is how I found it:

I tried connecting from several different servers, and found that the ones that worked had an older SSL version. I then checked the difference between what ciphers are used in the versions, and tried connecting with different ciphers.

When I connect using TLS1.2, I get:

$ openssl s_client -connect www.imstores.com:443 -tls1_2
CONNECTED(00000003)
write:errno=54
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 322 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
---

But when I connect with SSLv3 or TLS1, I get:

$ openssl s_client -connect www.imstores.com:443 -tls1
CONNECTED(00000003)
depth=0 /serialNumber=O3gPUAuGGROuHEhlyLaeJfj7SOn6tFTx/C=US/O=www.imstores.com/OU=GT29846307/OU=See www.geotrust.com/resources/cps (c)11/OU=Domain Control Validated - QuickSSL(R) Premium/CN=www.imstores.com
verify error:num=20:unable to get local issuer certificate
[...and so on, including server certificate...]
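
The same comparison can be run from Perl. This is a minimal sketch, assuming IO::Socket::SSL is installed in a version that accepts the names 'SSLv3', 'TLSv1' and 'TLSv1_2' for `SSL_version`, and that the local OpenSSL build still supports those protocols (an unsupported one is simply reported as failed). The `%outcome` hash and the 15-second timeout are illustrative, and certificate verification is switched off only because this is a diagnostic.

```perl
use strict;
use warnings;
use IO::Socket::SSL;

my $host = 'www.imstores.com';
my %outcome;    # illustrative: protocol name => result text

for my $version (qw(SSLv3 TLSv1 TLSv1_2)) {
    # Restrict the handshake to exactly one protocol version.
    my $sock = IO::Socket::SSL->new(
        PeerHost        => $host,
        PeerPort        => 443,
        SSL_version     => $version,
        SSL_verify_mode => SSL_VERIFY_NONE,    # diagnostics only
        Timeout         => 15,
    );

    if ($sock) {
        $outcome{$version} = 'ok, cipher ' . $sock->get_cipher;
        $sock->close;
    }
    else {
        # $SSL_ERROR is exported by IO::Socket::SSL; fall back to $! otherwise.
        $outcome{$version} = 'failed: ' . ( $SSL_ERROR || $! );
    }
    printf "%-8s %s\n", $version, $outcome{$version};
}
```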

Exactly how to make WWW::Mechanize use TLS1 or SSLv3 is left as an exercise to the student.

Jenny D
  • Amazing, thank you so much! I specified `ssl_opts => { SSL_version => 'SSLv3'}` in the constructor for WWW::Mechanize, and it's working beautifully again. Really, thank you for taking the time to answer this! :-) – Francisco Zarabozo Apr 16 '13 at 10:08
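
Put together, the fix from the last comment might look like the sketch below. The `ssl_opts => { SSL_version => 'SSLv3' }` setting is the one reported there; the `$version` variable and the `eval` wrapper are illustrative additions. It assumes LWP is using IO::Socket::SSL as its SSL backend and that the local OpenSSL build still offers the chosen protocol. SSLv3 has since been deprecated as insecure, so 'TLSv1' (IO::Socket::SSL's name for the TLS1 option in the answer) is likely the better choice wherever the server accepts it.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# 'SSLv3' is the value reported to work in the comment above;
# 'TLSv1' corresponds to the TLS1 option named in the answer.
my $version = 'SSLv3';

my $mec = WWW::Mechanize->new(
    ssl_opts => { SSL_version => $version },
);

# get() dies on failure by default, so trap the error and report it.
my $ok = eval {
    $mec->get('https://www.imstores.com/Ingrammicromx/login/login.aspx');
    1;
};

if ($ok) {
    print $mec->content;
}
else {
    warn "Request failed with SSL_version $version: $@";
}
```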