Python HTTP always 301 using sockets

Question

I wrote a simple program to get some information from a website using Python, but when I run the code below, it always returns the following 301 response. At the same time, my browser can visit the website without problems. Please tell me why this happens and how to change my code to avoid the problem.

HTTP/1.1 301 Moved Permanently
Date: Tue, 28 Aug 2018 14:26:20 GMT
Server: Apache
Referrer-Policy: origin-when-cross-origin
Location: https://www.ncbi.nlm.nih.gov/
Content-Length: 237
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://www.ncbi.nlm.nih.gov/">here</a>.</p>
</body></html>

import socket

searcher = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
searcher.connect(("www.ncbi.nlm.nih.gov", 80))
cmd = "GET https://www.ncbi.nlm.nih.gov/ HTTP/1.0\r\n\r\n".encode()
searcher.send(cmd)
while True:
    data = searcher.recv(512)
    if len(data)<1: break
    print(data.decode())
searcher.close()

Roomm · Accepted Answer · 2018-08-28T15:34:15.773


You receive a 301 because the site redirects plain HTTP requests to its HTTPS address (see the Location header in the response).
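To see the redirect target in the raw bytes a socket returns, the status code and the Location header can be pulled out of the response head. This is only a minimal sketch: `parse_redirect` is a hypothetical helper name, and the sample bytes are a shortened copy of the headers shown in the question.

```python
def parse_redirect(raw: bytes):
    """Return (status_code, location) from a raw HTTP response.

    location is None when the response has no Location header.
    """
    # Headers end at the first blank line; the body (if any) follows.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split(" ", 2)[1])

    location = None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            location = value.strip()
            break
    return status_code, location


# Shortened copy of the headers from the response in the question.
sample = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Server: Apache\r\n"
    b"Location: https://www.ncbi.nlm.nih.gov/\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)
print(parse_redirect(sample))
```

A 3xx status together with a Location value that starts with https:// is the sign that the server wants the request repeated over TLS.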

I don't know whether using sockets is mandatory, but if not, you can use requests, an easy-to-use library for making HTTP requests:

import requests

req = requests.get("http://www.ncbi.nlm.nih.gov")
html = req.text

With this, the 301 still happens, but requests follows the redirect for you, so it is transparent.
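If you want to confirm that a redirect was followed, requests keeps the intermediate responses in the `history` attribute of the final response. The helper below is a small sketch (`redirect_chain` is a hypothetical name) that only reads `history`, `status_code` and `url`, so it makes no network call by itself.

```python
def redirect_chain(response):
    """List (status_code, url) for each hop, ending with the final response."""
    # response.history holds the redirect responses that were followed, oldest first.
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops
```

Calling `redirect_chain(requests.get("http://www.ncbi.nlm.nih.gov"))` should list the 301 hop before the final response. Passing `allow_redirects=False` to `requests.get` returns the 301 itself instead of following it.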

If you want to do it with sockets, you have to connect to port 443 and add the "ssl layer" manually:

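import socket
import ssl

host = "www.ncbi.nlm.nih.gov"

# ssl.wrap_socket() was removed in Python 3.12; use an SSLContext instead.
# The default context verifies the server certificate and hostname.
context = ssl.create_default_context()
raw = socket.create_connection((host, 443))
searcher = context.wrap_socket(raw, server_hostname=host)

# Send an origin-form request line plus a Host header.
cmd = "GET / HTTP/1.0\r\nHost: {}\r\n\r\n".format(host).encode()
searcher.sendall(cmd)

# Collect all bytes first so multi-byte characters are not split across chunks.
chunks = []
while True:
    data = searcher.recv(512)
    if not data:
        break
    chunks.append(data)
searcher.close()
print(b"".join(chunks).decode(errors="replace"))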
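When following a Location header by hand with sockets, the target URL decides which port to connect to and whether the socket needs TLS. The following is a minimal sketch of that decision, assuming a simple GET with no extra headers; `plan_request` is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlsplit


def plan_request(url: str):
    """Return (host, port, use_tls, request_bytes) for a simple GET."""
    parts = urlsplit(url)
    use_tls = parts.scheme == "https"
    # Fall back to the scheme's default port when the URL names none.
    port = parts.port or (443 if use_tls else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Only the path goes in the request line; the host goes in the Host header.
    request = f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n".encode("ascii")
    return parts.hostname, port, use_tls, request


print(plan_request("https://www.ncbi.nlm.nih.gov/"))
```

The returned host and port can be passed to `socket.create_connection`, and the socket only needs to be wrapped with an `ssl.SSLContext` when `use_tls` is true.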

edited Aug 28 '18 at 15:34

answered Aug 28 '18 at 14:42

Roomm



Thank you very much! It worked by using requests. But I'm still curious about how to use Python sockets to reach HTTPS sites. Is that possible? – AngusMurphy Aug 28 '18 at 15:24

I edited the answer to add example code for doing it with sockets. – Roomm Aug 28 '18 at 15:34

Thank you. Your answer solved my problem perfectly and I'll accept it. – AngusMurphy Aug 29 '18 at 08:23
