Python: httplib getresponse issues many recv() calls

Question

getresponse issues many recv calls while reading the headers of an HTTP response. It actually issues one recv per byte, which results in a large number of system calls. How can this be optimized?

I verified this on an Ubuntu machine using strace.

Sample code:

import httplib

conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/index.html")
r1 = conn.getresponse()  # the status line and headers are read here

strace dump:

sendto(3, "HEAD /index.html HTTP/1.1\r\nHost:"..., 78, 0, NULL, 0) = 78
recvfrom(3, "H", 1, 0, NULL, NULL)      = 1
recvfrom(3, "T", 1, 0, NULL, NULL)      = 1
recvfrom(3, "T", 1, 0, NULL, NULL)      = 1
recvfrom(3, "P", 1, 0, NULL, NULL)      = 1
recvfrom(3, "/", 1, 0, NULL, NULL)      = 1
...
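
The one-byte reads in the trace are what a line reader produces when it has no buffer: it cannot look ahead for the newline, so it asks for one byte at a time. The sketch below illustrates that effect with the standard io module only. CountingRaw is a hypothetical stand-in for the socket, and this is not httplib's actual code path, just the same pattern in miniature.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a socket; counts low-level read calls."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Each call here plays the role of one recv() system call.
        self.calls += 1
        return self._src.readinto(b)


HEADER = b"HTTP/1.1 200 OK\r\n"

# Unbuffered: readline() has no buffer to search, so it reads byte by byte.
raw = CountingRaw(HEADER)
raw.readline()
unbuffered_calls = raw.calls

# Buffered: the reader pulls a large chunk, then scans it for the newline.
raw = CountingRaw(HEADER)
io.BufferedReader(raw).readline()
buffered_calls = raw.calls

print("unbuffered:", unbuffered_calls, "buffered:", buffered_calls)
```

The unbuffered count grows with the length of the line, while the buffered count does not.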

score 3 · Answer 1 · answered Jan 25 '13 at 11:28

r = conn.getresponse(buffering=True)

On Python 3.1+, getresponse() has no buffering parameter; buffered reading is the default.
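
If the same code has to run on both kinds of interpreter, a small wrapper can try the keyword first and fall back. This is only a sketch: get_buffered_response is a hypothetical helper name, not part of httplib, and it assumes that an interpreter whose getresponse() lacks the keyword raises TypeError when the call is made, before any data is read.

```python
def get_buffered_response(conn):
    """Call conn.getresponse(), requesting buffered reads where supported.

    Hypothetical helper for illustration; not an httplib API.
    """
    try:
        # Python 2.7 httplib: opt in to buffered header reads.
        return conn.getresponse(buffering=True)
    except TypeError:
        # No such keyword (for example Python 3 http.client, which
        # already buffers), so fall back to the plain call.
        return conn.getresponse()
```

With the sample from the question, the last line would become r1 = get_buffered_response(conn).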

jfs


I'm getting a lot of recvfrom() calls, each reading a single byte, when using urllib2.urlopen. I found that urllib2 uses HTTPConnection internally, but no arguments are passed in the getresponse() call. Is there any way to get rid of the enormous number of tiny recvfrom() calls? – Andrei Belov Sep 01 '14 at 11:01
@AndreiBelov Have you tried to use HTTPConnection directly and pass `buffering=True` to its getresponse() method? – jfs Sep 01 '14 at 11:07
@J.F.Sebastian Unfortunately, that's not an option. I've just figured out that urllib2 reads 1-byte chunks only for response headers, i.e. once the server starts sending the body in chunked encoding, things look better: 323 recvfrom(3, "\r", 1, 0, NULL, NULL) = 1 324 recvfrom(3, "\n", 1, 0, NULL, NULL) = 1 325 recvfrom(3, " \n – Andrei Belov Sep 01 '14 at 13:44
