This is what I have so far. Everywhere I've looked says this code should work, but it doesn't.

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
mysock.send(b'GET http://data.pr4e.org/romeo.txt HTTP/1.0\n\n')

while True:
    data = mysock.recv(512)
    if ( len(data) < 1 ) :
        break
    print (data)

mysock.close()

This is the output I get back:

b'HTTP/1.1 400 Bad Request\r\nDate: Sun, 25 Nov 2018 19:23:51 GMT\r\nServer: 
Apache/2.4.18 (Ubuntu)\r\nContent-Length: 308\r\nConnection: 
close\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE HTML 
PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>400 Bad 
Request</title>\n</head><body>\n<h1>Bad Request</h1>\n<p>Your browser sent a 
request that this server could not understand.<br 
/>\n</p>\n<hr>\n<address>Apache/2.4.18 (Ubuntu) Server at do1.dr-chuck.com 
Port 80</address>\n</body></html>\n'

This is what the example says I should get back:

HTTP/1.1 200 OK
Date: Sun, 14 Mar 2010 23:52:41 GMT
Server: Apache
Last-Modified: Tue, 29 Dec 2009 01:31:22 GMT
ETag: "143c1b33-a7-4b395bea"
Accept-Ranges: bytes
Content-Length: 167
Connection: close
Content-Type: text/plain
But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already sick and pale with grief

Why don't I get the same output?

ARR18

1 Answer

In some sense, your code works: it successfully sends a request to a server, and you do get a valid response back. You can see that the error message itself comes from the server.

But you do not get the expected result back, so that is indeed a problem. Directly opening http://data.pr4e.org/romeo.txt in the browser works correctly, so let's look a bit further, for example at questions such as "400 error header with sockets", which deal with pretty much the same problem.

After some experimenting, it seems that this web server requires Microsoft Windows style end-of-lines: both \r and \n. That matches the HTTP specification, which prescribes \r\n as the line terminator. Just a \n, as in your attempt, does not work; you get that error back. Just a \r makes the server wait indefinitely (or rather "quite a long time and certainly longer than I was prepared to wait for this experiment").

So this simple modification makes your original program work:

mysock.send(b'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n')

and returns, after a few headers, this poetry:

... But soft what bytes through yonder port breaks
It is a request and Http is the Sun ...

(admittedly slightly paraphrased)
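For reference, here is a minimal sketch of the whole program with that fix applied. The helper names build_request, split_response and fetch are my own, purely for illustration; they are not part of the original sample. I have also assumed you want the response collected into a single bytes object rather than printed chunk by chunk.

```python
import socket


def build_request(url):
    """Build an HTTP/1.0 GET request in which every line ends in CRLF."""
    # The second CRLF is the empty line that marks the end of the headers.
    return f"GET {url} HTTP/1.0\r\n\r\n".encode("ascii")


def split_response(raw):
    """Split a raw response into (headers, body) at the first blank line."""
    headers, _, body = raw.partition(b"\r\n\r\n")
    return headers, body


def fetch(host, url, port=80):
    """Send the request and collect the full response (needs network access)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_request(url))
        chunks = []
        while True:
            data = sock.recv(512)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling split_response(fetch('data.pr4e.org', 'http://data.pr4e.org/romeo.txt')) should then give you the headers and the text separately, and calling .decode() on the body gets rid of the b'...' wrapper you see when printing raw bytes.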

On some operating systems (Microsoft Windows is the only one I know), the standard code for end-of-line \n gets automatically expanded to \r\n, but only when writing to files or streams opened in text mode; a bytes literal passed to mysock.send() goes out exactly as written on every platform. So it seems more likely that the server was simply more lenient about bare \n line endings when your sample code was written and tested, and that its writer never knew (or cared) that HTTP expects this explicit type of line ending.
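If you want to check for yourself where newline translation does and does not happen, here is a small sketch. It uses an in-memory stream with newline="\r\n" to imitate Windows-style text mode on any platform; that setup is my own stand-in for illustration, not something from the sample code.

```python
import io

# A text-mode stream configured to translate "\n" into "\r\n" on write,
# which is how text-mode files behave by default on Windows.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
text.write("GET / HTTP/1.0\n\n")
text.flush()
translated = raw.getvalue()  # b'GET / HTTP/1.0\r\n\r\n'

# A bytes literal is never translated: what you type is what gets sent.
literal = b"GET / HTTP/1.0\n\n"
has_carriage_return = b"\r" in literal  # False
```

So any \r that the server should see has to be written out explicitly in the bytes you pass to send().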

Jongware