urllib2 throws an error for a URL that opens properly in a browser

Question

I am trying to open a URL from Python like this:

  import urllib2
  f = urllib2.urlopen('http://www.futurebazaar.com/Search/laptop')

It throws the following error:

    File "C:\Python26\lib\urllib2.py", line 1134, in do_open
      r = h.getresponse()
    File "C:\Python26\lib\httplib.py", line 986, in getresponse
      response.begin()
    File "C:\Python26\lib\httplib.py", line 391, in begin
      version, status, reason = self._read_status()
    File "C:\Python26\lib\httplib.py", line 355, in _read_status
      raise BadStatusLine(line)
  httplib.BadStatusLine

But the same URL opens fine in a browser.

What does your packet sniffer say? – Ignacio Vazquez-Abrams Mar 08 '11 at 16:07

score 5 · Accepted Answer · answered Mar 08 '11 at 16:11

The website is broken. If the optional "Accept" header isn't supplied, the site closes the connection without responding; this is invalid behavior.

Workaround:

  import urllib2
  req = urllib2.Request('http://www.futurebazaar.com/Search/laptop')
  # Send an explicit Accept header; the site drops the connection without it.
  req.add_header('Accept', '*/*')
  f = urllib2.urlopen(req)
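
For readers on Python 3, here is a rough equivalent of the same workaround. It assumes the `urllib.request` and `http.client` modules that replaced `urllib2` and `httplib`; the helper names `build_request` and `fetch` are made up for illustration, and returning `None` on failure is just one possible way to handle the error.

```python
import http.client
import urllib.request


def build_request(url):
    """Return a Request that carries an explicit Accept header."""
    # Same header as in the urllib2 workaround; '*/*' accepts any media type.
    return urllib.request.Request(url, headers={'Accept': '*/*'})


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if no valid status line arrived."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as f:
            return f.read()
    except http.client.BadStatusLine:
        # Python 3 counterpart of httplib.BadStatusLine: the status line was
        # missing or unparseable, e.g. the server closed the connection early.
        return None
```

Calling `fetch('http://www.futurebazaar.com/Search/laptop')` would then either return the page bytes or `None`, instead of propagating the traceback shown in the question.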

Glenn Maynard

I just loaded the page in my browser and grabbed the HTTP headers it sent, compared it to the headers sent by urllib, then shifted headers one at a time until I found the one that was breaking the page. – Glenn Maynard Mar 08 '11 at 17:14
That's great. Thanks once again for helping me out :) – Jijoy Mar 08 '11 at 17:24
I'm writing a script that syncs contacts from Exchange Server using Suds, and I kept intermittently getting this error. I usually don't like to mess with anything in site-packages, but I patched the suds http implementation with this fix and I haven't seen the error since. Perhaps I'll file a bug report with suds. Thank you! – serialworm Sep 21 '11 at 15:40
I spoke too soon. The error is back. Back to the drawing board. – serialworm Sep 22 '11 at 14:10
Thank you. This is indeed the correct solution to my problem. – serialworm Dec 22 '11 at 13:38
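
The header-comparison approach Glenn Maynard describes in the first comment can be roughly automated. The sketch below is a variation on it: instead of moving headers across one at a time, it starts from the full set of browser headers and drops one per attempt. `find_required_headers` and the `request_works` probe are hypothetical names; the probe is assumed to send a request with the given headers and return True if the server answers normally.

```python
def find_required_headers(browser_headers, request_works):
    """Return the names of headers whose removal makes the request fail.

    browser_headers: dict of header name -> value captured from a browser.
    request_works:   hypothetical probe; takes a headers dict and returns
                     True if the server responded normally.
    """
    if not request_works(browser_headers):
        raise ValueError('request fails even with the full browser header set')
    required = []
    for name in browser_headers:
        # Send every header except this one and see whether the site still answers.
        trial = {k: v for k, v in browser_headers.items() if k != name}
        if not request_works(trial):
            required.append(name)
    return required
```

With a real probe this costs one request per header. It only detects headers that are required on their own, not combinations where any one of several headers would do.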