I am trying to crawl a website using a Java program. Until last night it was working perfectly, but now the server returns error code 401.

However, I can still see the pages I want in my web browsers, so I don't know what is wrong. If the server added my IP to a blacklist, why can I still see the URLs in the web browsers? If not, what else can cause a 401 error?

Two more points: there is no username or password for this site; authentication is based on my IP.

Also, I tried changing my user agent, and now I get error 503.

Afshin Moazami

2 Answers

Most likely the server is blocking you based on the user agent, or on the request frequency from your IP address.
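
If request frequency is the trigger, spacing out the requests and sending a descriptive user agent may help. The following is only a minimal sketch under assumptions: the class name `PoliteFetcher`, the 2-second interval, and the user-agent string are all hypothetical and not taken from the question, and nothing here is known to match what this particular server expects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class PoliteFetcher {
    // Hypothetical values; tune them to what the site tolerates.
    static final long MIN_INTERVAL_MILLIS = 2000;
    static final String USER_AGENT = "MyCrawler/1.0 (contact: you@example.com)";

    /** How long to sleep so that two requests are at least minIntervalMillis apart. */
    static long millisToWait(long lastRequestMillis, long nowMillis, long minIntervalMillis) {
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0, minIntervalMillis - elapsed);
    }

    /** Sends one GET request with the user agent set and returns the HTTP status code. */
    static int fetchStatus(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestProperty("User-Agent", USER_AGENT);
        con.setConnectTimeout(10_000);
        con.setReadTimeout(10_000);
        try {
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        long last = 0; // no previous request yet, so the first wait is zero
        for (String arg : args) {
            Thread.sleep(millisToWait(last, System.currentTimeMillis(), MIN_INTERVAL_MILLIS));
            last = System.currentTimeMillis();
            System.out.println(arg + " -> " + fetchStatus(new URL(arg)));
        }
    }
}
```

Pass the URLs to crawl as command-line arguments. Keeping the wait computation in its own method makes it easy to swap in a longer or randomized interval.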

Daniel A. White
  • In which case the server is misusing response code 401, as there is no way that the client could potentially authenticate. See http://tools.ietf.org/html/rfc2616#section-10.4.2 – kdgregory Jan 02 '12 at 20:27
  • There are two issues. First, I can see the pages using the same computer. Also, I use a VPN account, and 6 of my computers that crawl a lot are blocked, but on the last one my program is still working. I am confused, because all of them use the same VPN, and in the same situation I can still browse the HTML page that I want to see! – Afshin Moazami Jan 02 '12 at 21:01
  • I tried to change my user agent using the following code, and now I receive error 503:
    URLcon = url.openConnection();
    URLcon.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)");
    – Afshin Moazami Jan 03 '12 at 00:47

401 is the HTTP status code for Unauthorized. The reason you can see the site from your web browser but not from your Java program could be that you enabled "Remember my user/password" in your browser; of course, that option is not available to your Java program.
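
One way to check whether the server is really asking for credentials is to look at the response headers. Under RFC 2616, a 401 response is supposed to carry a `WWW-Authenticate` challenge, and a 503 may carry `Retry-After`. The sketch below is an illustration only; the class name `StatusProbe` and its method names are hypothetical, and a server that uses these codes to block crawlers may not send either header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class StatusProbe {
    /** Turns a status code and the two relevant headers into a short diagnostic. */
    static String describe(int status, String wwwAuthenticate, String retryAfter) {
        if (status == 401) {
            return wwwAuthenticate != null
                    ? "401 with challenge: " + wwwAuthenticate
                    : "401 without WWW-Authenticate challenge";
        }
        if (status == 503) {
            return retryAfter != null
                    ? "503, Retry-After: " + retryAfter
                    : "503 without Retry-After";
        }
        return "status " + status;
    }

    /** Requests the URL and reports what the server sent back. */
    static String probe(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        try {
            int status = con.getResponseCode();
            return describe(status,
                    con.getHeaderField("WWW-Authenticate"),
                    con.getHeaderField("Retry-After"));
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String arg : args) {
            System.out.println(arg + ": " + probe(new URL(arg)));
        }
    }
}
```

A 401 that arrives without any challenge suggests the server is not asking for a username and password at all, and is using the code to reject the client for some other reason.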

Wilmer
  • My authorization is based on my IP, not a username and password! And it used to work. Also, as I said, the program is working on other computers. – Afshin Moazami Jan 03 '12 at 00:20