3

I'm running into a problem where urllib2.urlopen/requests.post very occasionally blocks forever on socket.recv and never returns.

I'm trying to find out why this is happening and fix the underlying problem, but in the meantime I wondered if there is a way to prevent it from blocking forever.

I already know about the optional timeout argument for urllib2.urlopen and about socket.setdefaulttimeout, but unfortunately a timeout isn't a solution for my use case: I'm uploading files with POST, so any timeout value I use would risk interrupting a normal file upload.

I've also seen some solutions using signals, but these would have the same problem as timeouts for me (and are also out of the question because I'm not doing this from the main thread).

Is it possible to timeout only if no data has been sent/received through the socket for a certain amount of time perhaps? Or maybe there's some way I can use select / poll to prevent the deadlock / blocking that I'm experiencing?

If there is a solution using select / poll, how would I go about incorporating this into urllib2.urlopen/requests.post?


I also had the idea that if I could send the file data through a write-style interface, where I control iterating over the file and sending one chunk at a time, I would probably have enough control to avoid the stalls. I'm not sure how to achieve that, though, so I asked a separate question: Upload a file with a file.write interface

UPDATE It seems I've always had a misconception about the meaning of timeout in Python: it is actually an idle timeout, or read/write timeout (probably the first time I've disagreed with Guido). I always thought it was the maximum amount of time within which the response should return. Thank you @tomasz for pointing this out!!

But after adding timeout parameters (tested with both urllib2 and requests) I've come across some really odd and subtle scenarios, possibly Mac-specific, where the timeout doesn't work correctly, and I'm more and more inclined to believe this is a bug. I'm going to continue to investigate and find out exactly what the issue is. Again, thank you tomasz for your help with this!

GP89
  • First of all - *why* is it blocking forever? – Code Painters Mar 18 '13 at 16:58
  • @CodePainters I don't know. Ideally I would solve the actual issue, and I'm going to continue to try, but it could be a server-side issue (that I don't control), so until I can find the cause I'd like to put something in place as a fallback so the uploads never freeze for eternity, and release that as a hotfix in the meantime. – GP89 Mar 18 '13 at 17:02

3 Answers

6

I believe you could get rid of the hanging states by tweaking your TCP settings at the OS level, but assuming your application is not going to run on a dedicated machine that you maintain, you should look for a more general solution.

You asked:

Is it possible to timeout only if no data has been sent/received through the socket for a certain amount of time perhaps

And this is exactly the behaviour that socket.settimeout (or the timeout passed to urllib2) gives you. In contrast to a timeout based on SIGALRM (which would terminate the call even during a slow data transfer), the timeout set on the socket fires only if no data has been transmitted during the defined period. A call to socket.send or socket.recv should return a partial count if some, but not all, of the data has been transmitted during the period, and urllib2 would then use a subsequent call to transmit the remaining data.

That said, your POST call could still be terminated somewhere in the middle of the upload if it is executed in more than one send call and any call (other than the first) blocks and times out without sending any data. You gave the impression that your application wouldn't handle this properly, but I think it should, as it is similar to a forceful termination of the process or simply a dropped connection.

Have you tested and confirmed that socket.settimeout doesn't solve your problem? Or were you just not sure how the behaviour is implemented? If the former, could you please give some more details? I'm quite sure you're safe with just setting the timeout, as Python is simply using the low-level BSD socket implementation, where the behaviour is as described above. For more references, take a look at the setsockopt man page and the SO_RCVTIMEO and SO_SNDTIMEO options. I'd expect socket.settimeout to use exactly this function and these options.
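The idle-timeout behaviour described above can be checked locally without an HTTP server. The following is a minimal sketch, not the test code used in this answer: it assumes a platform where socket.socketpair is available, and the names slow_sender, receive_until_idle and demo are made up for illustration. The sender trickles data for longer than the timeout value and then goes silent; the receiver only times out once the data stops.

```python
import socket
import threading
import time


def slow_sender(sock, chunks, delay):
    """Send `chunks` blocks of 1 KB, pausing `delay` seconds before each."""
    for _ in range(chunks):
        time.sleep(delay)
        sock.sendall(b"x" * 1024)
    # Deliberately keep the socket open and send nothing more.


def receive_until_idle(sock, idle_timeout):
    """Read until no data arrives for `idle_timeout` seconds.

    Returns (bytes_received, seconds_elapsed).
    """
    sock.settimeout(idle_timeout)
    received = 0
    start = time.time()
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break  # the peer closed the connection
            received += len(data)
    except socket.timeout:
        pass  # nothing arrived within idle_timeout seconds
    return received, time.time() - start


def demo():
    a, b = socket.socketpair()
    # 8 chunks with 0.1 s gaps: about 0.8 s of slow but steady transfer.
    sender = threading.Thread(target=slow_sender, args=(a, 8, 0.1))
    sender.start()
    # The 0.5 s timeout is shorter than the whole transfer,
    # but longer than any single gap between chunks.
    result = receive_until_idle(b, 0.5)
    sender.join()
    a.close()
    b.close()
    return result


if __name__ == "__main__":
    received, elapsed = demo()
    print("received %d bytes in %.1f s" % (received, elapsed))
```

The receiver collects every chunk even though the transfer as a whole takes longer than the timeout value, which is the difference between an idle timeout and a limit on the whole operation.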

--- EDIT --- (to provide some test code)

So I was able to get the Requests module and test its behaviour along with urllib2. I ran a server that received blocks of data with an increasing interval between recv calls. As expected, the client timed out when the interval reached the specified timeout. Example:

Server

import socket
import time

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("localhost", 12346))
listener.listen(1)
sock, _ = listener.accept()

interval = 0.5
while True:
  interval += 1  # wait one second longer before each read
  time.sleep(interval)
  # Ask for up to 1 MB; the socket buffer limits how much actually arrives
  data = sock.recv(1000000)
  print interval, len(data)
  if not data:  # empty read: the client closed the connection
    break

Client (Requests module)

import requests

data = "x"*100000000 # 100MB beefy chunk
requests.post("http://localhost:12346", data=data, timeout=4)

Client (urllib2 module)

import urllib2

data = "x"*100000000 # 100MB beefy chunk
urllib2.urlopen("http://localhost:12346", data=data, timeout=4)

Output (Server)

> 1.5 522832
> 2.5 645816
> 3.5 646180
> 4.5 637832 <--- Here the client dies (4.5 seconds without data transfer)
> 5.5 294444
> 6.5 0

Both clients raised an exception:

# urllib2
URLError: timeout('timed out',)

# Requests
Timeout: TimeoutError("HTTPConnectionPool(host='localhost', port=12346): Request timed out. (timeout=4)",)

Everything works as expected! When no timeout was passed as an argument, urllib2 also responded correctly to socket.setdefaulttimeout, but Requests did not. That's not a surprise, as the internal implementation doesn't need to use the default value at all and could simply overwrite it depending on the passed argument, or use non-blocking sockets.
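A small sketch of why a library may appear to ignore socket.setdefaulttimeout: the default only applies to sockets that nobody configures afterwards, and an explicit settimeout call on the socket always wins. The helper name timeouts_with_default is hypothetical, and this illustrates the socket module only; it makes no claim about what Requests actually does internally.

```python
import socket


def timeouts_with_default(default):
    """Return (inherited, overridden) timeouts for two new sockets."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(default)
    try:
        inherited = socket.socket()   # picks up the process-wide default
        overridden = socket.socket()
        overridden.settimeout(None)   # an explicit call replaces the default
        result = (inherited.gettimeout(), overridden.gettimeout())
        inherited.close()
        overridden.close()
        return result
    finally:
        # Restore the previous default so other code is unaffected.
        socket.setdefaulttimeout(previous)


if __name__ == "__main__":
    print(timeouts_with_default(4))
```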

I've been running this using the following:

OSX 10.8.3
Python 2.7.2
Requests 1.1.0
tomasz
  • Looks like a massive slap forehead moment.. From testing `setdefaulttimeout` with a very low value I can see that it has no effect on uploads (that are uploading fine). For some reason I thought that it would timeout after that amount of time from the start. I guess because I've almost always done web requests in the past, and the time between the start of the call and the end of the request is so small that it appears that the timeout is acting as a limit on the time of the whole operation and not a read/write timeout. Thank you for pointing that out! – GP89 Mar 25 '13 at 10:50
  • Also, do you know what errno it will raise? `errno.ETIMEDOUT` I would guess but looking up `SO_RCVTIMEO` and `SO_SNDTIMEO` it looks like it might be something different. – GP89 Mar 25 '13 at 10:51
  • I've been testing this with `requests.put`, and setting `socket.setdefaulttimeout` doesn't seem to work, while passing a `timeout` kwarg means I can't upload anything: I just continually get a socket error and `resource temporarily unavailable`. Any idea? – GP89 Mar 25 '13 at 13:19
  • My previous comment was wrong: `urllib2` will raise `socket.timeout` instead of `urllib2.URLError`. (I was misled here by the documentation and removed the comment.) I've been testing `socket.setdefaulttimeout` and the `timeout` argument to `urllib2.urlopen`, and the behaviour was as expected. I'm not familiar with the `Requests` module, but I should be able to take a look later today. _Resource temporarily unavailable_ suggests the underlying socket returns EAGAIN and is in non-blocking mode (which could happen if you pass `0` as a timeout). Are you getting this error straight away or after the defined period? – tomasz Mar 25 '13 at 13:57
  • Thanks for your help! I'm continuing to test as well. I think the EAGAIN thing is Mac-specific (http://bugs.python.org/issue8493), which might be why you're not seeing it (assuming you're not using a Mac). I get it right away, and from inspecting the HTTP request I can see I'm sending 0 bytes. How are you testing `socket.setdefaulttimeout`, by the way? Ideally I want to stop mid-upload and see if the exceptions get raised, but I haven't thought of a way of doing that yet. – GP89 Mar 25 '13 at 15:18
  • Interesting. I've been testing on Linux, but I have OSX at home, so I will take a look. To test `socket.setdefaulttimeout` I've written a very simple Python script that opens a socket for listening but never calls `accept` and sleeps forever instead. Everything trying to connect to this socket should time out. Timing out in the middle of an upload will be a little trickier, as you'll probably need to upload a big chunk of data and never call a second `recv` on the server side. I haven't tested this. – tomasz Mar 25 '13 at 15:33
  • Yea, I set up a socket locally like you suggested and it looks like I was slightly wrong. I'm sending the headers and roughly 664KB of file data, then I get the EAGAIN "resource temporarily unavailable", which from that bug link seems to be raised when the network buffer is full on Mac. So the socket doesn't have a chance to time out, as this error is always raised first. This could potentially be just as bad as the original problem of the app hanging. I guess this EAGAIN wouldn't normally be a problem with web requests, only if the network buffer is filled like I'm doing with uploads – GP89 Mar 25 '13 at 19:46
  • @GP89 I've added some sample code to my original response. I'm running this on OSX and can't see any issues. The bug in the link you sent seems to be invalid (judging by the last comment), or at least suggests the error is raised as a timeout (in which case it wouldn't be a problem). Please take a look at my examples and see how they compare to yours. Maybe once you've isolated and clearly defined the problem, you should open a different question to attract some attention? I guess nobody is really reading through all these comments :) – tomasz Mar 25 '13 at 23:55
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/26896/discussion-between-gp89-and-tomasz) – GP89 Mar 26 '13 at 00:13
1

You mention that the indefinite blocking happens "very occasionally", and that you're looking for a fallback to avoid failing file uploads when this happens. In this case, I recommend using a timeout for your post calls, and retrying the post in case of timeouts. All this requires is a simple for loop, with a break if anything happens other than a timeout.

Of course, you should log a warning message when this happens, and monitor how often this happens. And you should try to find the underlying cause of the freezes (as you mentioned you intend to).
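A minimal sketch of such a retry loop, assuming the upload is wrapped in a callable. The names post_with_retries, do_post and timeout_errors are hypothetical; only socket.timeout is caught by default, so any other exception ends the loop immediately by propagating to the caller.

```python
import logging
import socket

log = logging.getLogger("upload")


def post_with_retries(do_post, attempts=3, timeout_errors=(socket.timeout,)):
    """Call do_post() until it returns without timing out.

    Gives up after `attempts` timeouts and raises socket.timeout.
    Exceptions not listed in `timeout_errors` are not retried.
    """
    for attempt in range(1, attempts + 1):
        try:
            return do_post()
        except timeout_errors:
            # Log every timeout so the frequency can be monitored.
            log.warning("upload attempt %d of %d timed out", attempt, attempts)
    raise socket.timeout("upload timed out %d times" % attempts)
```

With Requests, do_post could be something like `lambda: requests.post(url, data=payload, timeout=30)` with `requests.exceptions.Timeout` added to timeout_errors; here url and payload are placeholders. If the payload is an open file, it would need to be rewound before each retry.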

taleinat
  • Looks like I can use a `timeout`, you're right. I always thought the timeout was the max time that the call would take, which I didn't think I could accurately work out (it would suck if the timeout kicked in a few MB short of a few GB of upload and the user had to start over), but it seems that my understanding of the timeout was wrong, and it does act as a read/write timeout, which is what I was looking for! – GP89 Mar 25 '13 at 10:59
0

One possible solution: you could wrap your urllib2 request in a block with SIGALRM handling, or run it in a thread that is forcibly stopped on timeout. This forces the request to stop when the timeout expires, regardless of any internal urllib2 problem. An old question covers this case: Python: kill or terminate subprocess when timeout
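A rough sketch of the thread variant, under some stated assumptions: Python has no supported way to forcibly stop a thread, so this version only stops waiting after the deadline and abandons the worker as a daemon thread. The name run_with_deadline is hypothetical. Note that this enforces a limit on the whole call, not an idle timeout.

```python
import threading
import time


def run_with_deadline(func, deadline, *args, **kwargs):
    """Run func in a daemon thread and wait at most `deadline` seconds.

    Returns (finished, result). The result is None if the call did not
    finish in time or if func raised an exception.
    """
    outcome = {}

    def worker():
        outcome["result"] = func(*args, **kwargs)

    thread = threading.Thread(target=worker)
    thread.daemon = True  # do not keep the process alive for a stuck call
    thread.start()
    thread.join(deadline)
    if thread.is_alive():
        return False, None  # still blocked; the worker is abandoned
    return True, outcome.get("result")


if __name__ == "__main__":
    print(run_with_deadline(time.sleep, 0.1, 1.0))  # sleeps past the deadline
    print(run_with_deadline(len, 1.0, "abc"))       # finishes in time
```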

moonsly
  • But that's not what the OP needs: "Is it possible to timeout only if no data has been sent/received through the socket for a certain amount of time perhaps?" – Code Painters Mar 18 '13 at 17:22
  • Yea, I can't use signals because I'm not uploading from the main thread, and I think it would work no better than specifying a timeout anyway (which won't work for me). And the idea of using a thread would effectively be the same as specifying a timeout. – GP89 Mar 18 '13 at 17:28
  • old question: http://stackoverflow.com/questions/5686490/detect-socket-hangup-without-sending-or-receiving seems to be useful in your case – moonsly Mar 18 '13 at 18:22
  • @moonsly I'm not sure, if `urllib2.urlopen` or `requests.post` is blocking how would I go about sending data down the socket to check if the server has stopped listening? – GP89 Mar 20 '13 at 20:22