0

I'm trying write an HTTP client in python using the sockets library and can't get the receive part working.

Here is my code:

import socket, sys

class httpBase:
    def __init__(self, host, port=80):
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.s.connect((host, port))
    def send(self, msg):
        self.s.sendall(msg)
    def recive(self):
        data = ''
        while 1:
            Tdata = self.s.recv(128)
            print("||" + data + "|")
            data += Tdata
            if data.decode() == '': break
        return data


http = httpBase('www.google.com')
http.send('GET / HTTP/1.1\r\n\r\n'.encode())
print(http.recive())

The problem is what I get in response with out the print inside of the recive function I get nothing back and the code just waits and I have to force stop it.

Here is the response from google:

|||
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8h3zii3-ibcyo8zdcKg8WmJjbYYr_hCX4NWWvMTCw1dVwTHKtJbo1M6ay977MwX5hswJ6XeadRFIpd5Pe4La2HBRF; expires=Fri, 12-Sep-2014 16:40:26 GMT;|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8h3zii3-ibcyo8zdcKg8WmJjbYYr_hCX4NWWvMTCw1dVwTHKtJbo1M6ay977MwX5hswJ6XeadRFIpd5Pe4La2HBRF; expires=Fri, 12-Sep-2014 16:40:26 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8h3zii3-ibcyo8zdcKg8WmJjbYYr_hCX4NWWvMTCw1dVwTHKtJbo1M6ay977MwX5hswJ6XeadRFIpd5Pe4La2HBRF; expires=Fri, 12-Sep-2014 16:40:26 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Thu, 13 Mar 2014 16:40:26 GMT
Server: gws
Content-Length: 277
X-XSS-Protection:|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8h3zii3-ibcyo8zdcKg8WmJjbYYr_hCX4NWWvMTCw1dVwTHKtJbo1M6ay977MwX5hswJ6XeadRFIpd5Pe4La2HBRF; expires=Fri, 12-Sep-2014 16:40:26 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Thu, 13 Mar 2014 16:40:26 GMT
Server: gws
Content-Length: 277
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic

<HTML><HEAD><meta http-equiv="content-type" content=|
||HTTP/1.1 302 Found
Location: http://www.google.co.il/?gfe_rd=ctrl&ei=et8hU-qsFaXY8ge6moCYAg&gws_rd=cr
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=502cb60127440cb1:FF=0:TM=1394728826:LM=1394728826:S=gXXQi28MXZy3d-U7; expires=Sat, 12-Mar-2016 16:40:26 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=pnIbo1mi1JNuqB9sxTHn41_sdPg6Za-1nQLp_Wk8h3zii3-ibcyo8zdcKg8WmJjbYYr_hCX4NWWvMTCw1dVwTHKtJbo1M6ay977MwX5hswJ6XeadRFIpd5Pe4La2HBRF; expires=Fri, 12-Sep-2014 16:40:26 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Thu, 13 Mar 2014 16:40:26 GMT
Server: gws
Content-Length: 277
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.g|
Lucas Zamboulis
  • 2,494
  • 5
  • 24
  • 27
fokolo
  • 33
  • 3
  • What happens if you switch the "print" and the "data += Tdata" statements in the while-loop? My guess is that the last print will then show the complete response. Also, doesn't a HTTP response end with '\r\n\r\n', just as your request? Maybe try data.decode().endswith('\r\n\r\n')? – simon Mar 13 '14 at 17:15

1 Answers1

0

This looks like the same problem as in Python Recv() stalling, e.g. using HTTP/1.1 and wondering why the server does not close after the request. See there for details.

Community
  • 1
  • 1
Steffen Ullrich
  • 114,247
  • 10
  • 131
  • 172
  • is it possible doing it without reading headers? i tried checking if it ends with an '\r\n' but not all web servers ends the response with '\r\n' – fokolo Mar 15 '14 at 08:15
  • If you do a HTTP/1.0 request and don't explicitly enable keep-alive you could read until the server closes the connection. Then you get the content with the header on top. But, you still have to deal with 30x redirects etc so I would consider it only an exercise - better to use the established HTTP libraries to deal with all the details. – Steffen Ullrich Mar 15 '14 at 09:12
  • but I want to keep the connection alive and stay in 1.1 – fokolo Mar 15 '14 at 16:02
  • Then you should learn more about the HTTP protocol as defined in RFC2616. Especially look into how the length of the body can be defined, e.g. content-length header, chunked mode or close of connection. – Steffen Ullrich Mar 15 '14 at 18:41