
I'm creating a forum status grabber, and I want to fetch the data from the forum over a raw socket. I write an HTTP request header to the socket, but the server answers with a 400 error. I wrote a test script to check this, but I still get the error.

import socket
s = socket.socket()
s.connect(("198.57.47.136", 80))
header = """
GET / HTTP/1.1\r\n
Host: httn
Connection: keep-alive\r\n
Cache-Control: max-age=0\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n
User-Agent: Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36 OPR/26.0.1656.60\r\n
Accept-Encoding: gzip, deflate, lzma, sdch\r\n
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r\n
"""
s.send(header)
print s.recv(10000)

Which returns

HTTP/1.1 400 Bad Request
Server: nginx
Date: Thu, 01 Jan 2015 21:43:47 GMT
Content-Type: text/html
Content-Length: 166
Connection: close
<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
jww
54.224.239.54

2 Answers


The problem is probably the format of your request.

First, your HTTP request starts with a line feed. Also, the lines in an HTTP request must be separated by \r\n, while Python multiline strings only contain \n. Since you also wrote a literal \r\n on some lines (but not all), the line endings end up inconsistent.

Finally, the header must end with an empty line.

My advice is to use a list of strings without any line ending, and then join them:

header_lines = [
 "GET / HTTP/1.1",
 "Host: httn",
 "Connection: keep-alive",
 ...
]

header = "\r\n".join(header_lines) + "\r\n\r\n"

Note that since str.join() does not add a final EOL, you have to add two of them to include the mandatory empty line.
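
As a minimal sketch of the approach above, written for Python 3 (where sockets take bytes): `build_request` and the host `example.com` are hypothetical names chosen for illustration, not taken from the question. The request can be assembled and inspected before it is sent:

```python
def build_request(host, path="/", extra_headers=None):
    """Build a raw HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
    ]
    # Optional headers are appended as "Name: value" lines, without line endings.
    for name, value in (extra_headers or {}).items():
        lines.append("{}: {}".format(name, value))
    # Join with CRLF, then add one CRLF to end the last line and one more
    # for the mandatory empty line that terminates the header.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("example.com", extra_headers={"Connection": "close"})
print(repr(request))
```

Printing the repr makes stray or missing line endings visible, which is hard to see in the multiline string from the question.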

rodrigo
  • I've just tried it. It doesn't even return a response. – 54.224.239.54 Jan 01 '15 at 23:20
  • @54.224.239.54: My bad, you have to have two `"\r\n"` at the end. That's because `str.join()` does not terminate the last line with an EOL. I'm correcting the answer. – rodrigo Jan 02 '15 at 01:25

A triple-quoted Python string keeps the real newline at the end of every source line, so writing \r\n explicitly leaves an extra \n on each line. Note:

>>> s = '''
... Host: rile5.com\r\n
... '''
>>>
>>> s
'\nHost: rile5.com\r\n\n'

There is an extra empty first line, and each line ends in two \n characters. The following works, but not against the original IP address you used:

import socket
s = socket.socket()
s.connect(("rile5.com", 80))
header = b"""\
GET / HTTP/1.1\r
Host: rile5.com\r
Connection: keep-alive\r
Cache-Control: max-age=0\r
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r
User-Agent: Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36 OPR/26.0.1656.60\r
Accept-Encoding: gzip, deflate, lzma, sdch\r
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r
\r
"""
s.sendall(header)
print(s.recv(10000))

Note the extra backslash after the opening quotes. This suppresses the initial newline.

header = b"""\

Also note the extra blank line at the end. This is required so the server knows the header is complete.
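
Both points can be checked without touching the network by inspecting a shortened literal. This is only a sketch; the request is trimmed to two lines from the example above for brevity:

```python
# Each source line ends with the escape \r followed by a real newline,
# so every line in the resulting bytes ends with CRLF.
header = b"""\
GET / HTTP/1.1\r
Host: rile5.com\r
\r
"""

print(repr(header))
# The request must not start with a newline and must end with an empty line.
print(header.startswith(b"GET"), header.endswith(b"\r\n\r\n"))
```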

Why not just use urllib.request?
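
For comparison, a rough sketch of the same request with `urllib.request` from the Python 3 standard library. The `fetch` helper and the shortened User-Agent value are assumptions for illustration; the library builds the request line, the Host header, and the line endings itself:

```python
import urllib.request

URL = "http://rile5.com/"  # host taken from the example above

# Only the headers you actually care about need to be listed.
req = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})


def fetch(request, timeout=10):
    """Send the request and return (status code, body bytes)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling `fetch(req)` performs the real network request, so it is left out of the block itself.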

Mark Tolonen