2

I'm using Python 3's urllib.request.urlopen() function. In the past it has worked fine; in fact, I used it a few months ago for the same program without problems. Now, however, the server is logging 301 responses whenever I try to use an API to post to my site. When I use response.getcode() to check the response, it says 200.

What would cause this discrepancy? Is there another method to check if my request is failing, or a way to debug it? I personally don't have access to the server, but I can ask the admin to check the logs for me.

Thanks guys!

Edit

I've found the HTTP requests in Wireshark. It sends a POST request, gets a 301 (text/html), then sends a GET request and gets 200 (application/json). What does this mean? My original request was JSON (I used urllib.request.urlopen(url, data)). The first response, for the POST to http://sefaria.org/api/texts/Rashi_on_Berakhot.2a.1.1, is:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
    <head>
    <title>301 Moved Permanently</title>
    </head>
    <body>
        <h1>Moved Permanently</h1>
        <p>The document has moved <a href="http://www.sefaria.org/api/texts/Rashi_on_Berakhot.2a.1.1">here</a>.</p>\n
        <hr>
        <address>Apache/2.4.6 (Ubuntu) Server at sefaria.org Port 80</address>
    </body>
</html>

The GET request for that URL responds with 200.
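One way to spot this from the client side, without Wireshark, might be to compare the URL passed to urlopen() with response.geturl(), which reports the URL that was actually retrieved after any redirects. This is only a minimal sketch: `describe_redirect` and `post_json` are hypothetical helper names, and the Content-Type header is an assumption about what the API expects.

```python
import json
import urllib.request


def describe_redirect(requested_url, final_url):
    """Return a short note if the final URL differs from the requested one."""
    if requested_url == final_url:
        return None
    return "redirected: {} -> {}".format(requested_url, final_url)


def post_json(url, payload):
    """POST payload as JSON; return (status code, redirect note or None)."""
    data = json.dumps(payload).encode("utf-8")
    # Assumption: the API accepts a JSON body with this Content-Type.
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # geturl() is the URL actually retrieved, after any redirects,
        # so a mismatch means the 200 came from a different location.
        note = describe_redirect(url, response.geturl())
        return response.getcode(), note
```

If `post_json` returns a 200 together with a non-empty note, the 200 belongs to the redirected request rather than to the original POST.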

Community
  • 301 is a redirect. Is it possible that `urllib` is auto-following the redirect, and then getting a 200 at the new location? When all else fails, fire up Wireshark. – Jonathon Reinhart Oct 13 '14 at 00:50
  • Yeah, that's possible. I found the requests in Wireshark, but how do I find the response code? If it's redirecting, how do I find where it redirected to? I'm not familiar with Wireshark. –  Oct 13 '14 at 20:26
  • It should look something like [this](http://www.tohir.co.za/wp-content/uploads/2010/09/wireshark_filters.png). You should see either `HTTP/1.1 200 OK` or `HTTP/1.1 301 Moved Permanently` or something similar. Then dig into the protocol-specific layers in the bottom pane. – Jonathon Reinhart Oct 13 '14 at 22:26

2 Answers

1

So it turns out that I was posting to http://website.com instead of http://www.website.com... whoops.
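To make this kind of mistake fail loudly instead of silently, one option might be a redirect handler that refuses to follow redirects for POST requests. This is a sketch under assumptions, not a built-in urllib feature: `NoPostRedirectHandler` and `open_without_post_redirect` are hypothetical names introduced here for illustration.

```python
import urllib.error
import urllib.request


class NoPostRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Raise on any redirect of a POST; defer to the default otherwise."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if req.get_method() == "POST":
            # Surface the redirect as an error so the caller sees the
            # 301 instead of the status of a follow-up request.
            raise urllib.error.HTTPError(req.full_url, code, msg, headers, fp)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def open_without_post_redirect(request):
    """Open a request using an opener that never redirects a POST."""
    # build_opener() uses this handler in place of the default redirect handler.
    opener = urllib.request.build_opener(NoPostRedirectHandler)
    return opener.open(request)
```

With this in place, posting to the wrong host should raise an HTTPError whose `code` is the redirect status, and its `headers` should carry the Location the server wanted.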

-1

This is not the expected behaviour when encountering a 301:

If the 301 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

(http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html)

You should consider reporting this as a bug to the Python developers.

Klaus D.
  • I believe it didn't automatically redirect, which was my problem. It failed silently. Unless you're saying the follow-up GET request was the redirect. –  Oct 15 '14 at 13:13
  • It's not a bug, it's a feature. Redirects are so incredibly common, following redirects by default is the only sensible thing to do. – w0rp Oct 15 '14 at 14:15
  • @w0rp urllib.request.HTTPRedirectHandler.redirect_request() explicitly raises an HTTP error on a 301 response to a POST in Python 3.4.0. So the scenario described above should not happen. It is not a feature. See the code if you still have doubts. – Klaus D. Oct 15 '14 at 15:10
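
The disagreement in these comments can be checked on whatever Python version is installed by calling redirect_request() directly, with no network traffic. The following is a rough sketch: `probe_post_redirect` is a hypothetical helper, and the example.com URLs are placeholders.

```python
import urllib.error
import urllib.request


def probe_post_redirect(code):
    """Report how the stock redirect handler treats a POST that gets `code`.

    Returns the HTTP method of the follow-up request, or "HTTPError" if
    the handler refuses to follow the redirect.
    """
    handler = urllib.request.HTTPRedirectHandler()
    # Placeholder URLs; nothing is sent over the network.
    original = urllib.request.Request("http://example.com/api", data=b"{}")
    try:
        follow_up = handler.redirect_request(
            original, None, code, "redirect", {}, "http://www.example.com/api"
        )
    except urllib.error.HTTPError:
        return "HTTPError"
    return follow_up.get_method()


if __name__ == "__main__":
    for status in (301, 302, 303, 307):
        print(status, probe_post_redirect(status))
```

A result of "GET" for 301 would match the Wireshark trace in the question (POST, then 301, then GET), while "HTTPError" would match the behaviour described in the last comment.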