I have this simple URL which I want to call from my Python script: http://test.my-site.com/bla-blah/createAccount (I changed some letters for privacy; all special characters etc. are exactly the same.)
    import urllib2

    def myfunc(self, url):
        result = urllib2.urlopen(url).read()
        # HTTP Error 400: Bad Request
When I call the above URL, I get the error:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 406, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 444, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 400: Bad Request
I do not think it has anything to do with quotes (or whitespace, obviously). When I call the URL http://test.my-site.com/bla-blah/listAccounts instead, it works fine, and result contains exactly the same text I get when I call that URL in my browser. Of course I also checked the first URL in my browser, and there it works fine.
Any idea what this might be?
Edit for clarification:
These two URLs should be callable without any further parameters or query strings, exactly as they stand above. The site should then respond with something like "error: parameters missing". That is what happens when I call the URLs in my browser or via curl in bash; only the Python module is causing problems.
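(For comparison with curl, one way to see the raw request urllib2 actually sends is to turn on the underlying httplib debug output. This is just a diagnostic sketch, assuming the same placeholder URL as above:)

    import urllib2

    # debuglevel=1 makes the underlying httplib connection echo the
    # outgoing request and the incoming response headers to stdout,
    # roughly what `curl -v` shows.
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
    try:
        opener.open("http://test.my-site.com/bla-blah/createAccount")
    except urllib2.HTTPError as e:
        print("status: %d" % e.code)  # the 400 is still raised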
Edit 2 (I also changed the post title to match the situation better):
Thanks, you were right: if I run curl -v 'http://test.my-site.com/bla-blah/createAccount', I get the following:
* About to connect() to <blackened> port 80 (#0)
* Trying 193.46.215.110... connected
> GET <blackened> HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: <blackened>
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< content-language: en-US
< server: <blackened>
< date: Thu, 04 Dec 2014 07:20:15 GMT
< set-cookie: beng_proxy_session=e2e037e7e79c1b03; HttpOnly; Path=/; Version=1; Discard
< p3p: CP="CAO PSA OUR"
< content-length: 234
<
error: parameter x missing
error: parameter y missing
* Connection #0 to host <blackened> left intact
* Closing connection #0
As you can see, the server answers with an HTTP error status. curl (and the browser) nevertheless go on and print the response body ("error: parameter ... missing"), but Python's urllib2 stops as soon as it sees the error status and never shows me the body. (The 400 itself is apparently sent by the server application, I guess, so it has nothing to do with urllib2.)

So we are one step closer, but I still need to see the body even when there is an error, because I have to know (and show) what exactly went wrong. And just now I was able to find a solution for that:
    import urllib2

    try:
        response = urllib2.urlopen("http://test.my-site.com/bla-blah/createAccount")
        contents = response.read()
        print("success: %s" % contents)
    except urllib2.HTTPError as e:
        # HTTPError itself behaves like a response object, so the body
        # the server sent along with the error is still readable here.
        contents = e.read()
        print("error: %s" % contents)
This way I get the body of the page whether the call succeeds or fails.
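Wrapped up as a small reusable helper (a sketch based on the snippet above; fetch_body is just a hypothetical name):

    import urllib2

    def fetch_body(url):
        # Return (status_code, body) for both success and HTTP error
        # responses such as the 400 above.
        try:
            response = urllib2.urlopen(url)
            return response.getcode(), response.read()
        except urllib2.HTTPError as e:
            # The error body sent by the server is still readable.
            return e.code, e.read()

    status, body = fetch_body("http://test.my-site.com/bla-blah/createAccount")
    print("%d: %s" % (status, body))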
(By the way, this is the post I got the solution from: Overriding urllib2.HTTPError or urllib.error.HTTPError and reading response HTML anyway.)
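Since the title of that post also mentions urllib.error.HTTPError: the same pattern in Python 3 would presumably look like this (a sketch, same placeholder URL):

    import urllib.request
    import urllib.error

    try:
        response = urllib.request.urlopen("http://test.my-site.com/bla-blah/createAccount")
        contents = response.read()
        print("success: %s" % contents)
    except urllib.error.HTTPError as e:
        # Same idea: HTTPError is itself a readable response object.
        contents = e.read()
        print("error: %s" % contents)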
Thank you very much!