I have the following simple web server, using Python's http.server
module:
    import hashlib
    import http.server


    class RequestHandler(http.server.BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"

        def do_PUT(self):
            md5 = hashlib.md5()
            # Read exactly Content-Length bytes, hashing each block as it arrives.
            remaining = int(self.headers['Content-Length'])
            while remaining > 0:
                data = self.rfile.read(min(remaining, 16384))
                if not data:
                    break  # client closed the connection early
                md5.update(data)  # hash before checking for the end of the body
                remaining -= len(data)
            print(md5.hexdigest())
            self.send_response(204)
            self.send_header('Connection', 'keep-alive')
            self.end_headers()


    server = http.server.HTTPServer(('', 8000), RequestHandler)
    server.serve_forever()
When I upload a file with curl, this works fine:
    curl -vT /tmp/test http://localhost:8000/test
Because the file size is known up front, curl sends a Content-Length: 5
header, so I know how many bytes to read from the socket.
But if the file size is unknown, or the client decides to use chunked
Transfer-Encoding, this approach fails, because there is no Content-Length
header to rely on.
This can be simulated with the following command:
    curl -vT /tmp/test -H "Transfer-Encoding: chunked" http://localhost:8000/test
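For reference, this is roughly what a chunked body looks like on the wire: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The sketch below only illustrates that framing; the function name encode_chunked and the chunk_size parameter are made up for this example, and the chunk sizes a real client such as curl picks will differ.

```python
def encode_chunked(payload: bytes, chunk_size: int = 4) -> bytes:
    """Frame payload as an HTTP/1.1 chunked body (no extensions, no trailers).

    Hypothetical helper for illustration only; not part of http.server.
    """
    out = bytearray()
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        # Each chunk: size in hex, CRLF, the data itself, CRLF.
        out += b"%x\r\n" % len(chunk)
        out += chunk + b"\r\n"
    # A zero-size chunk followed by an empty line terminates the body.
    out += b"0\r\n\r\n"
    return bytes(out)


print(encode_chunked(b"hello"))
```

Note that nothing in this stream says how long the whole body is; the receiver only learns that when it reaches the zero-size chunk.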
If I read from self.rfile past the end of the chunk, the read blocks
forever and the client hangs until it drops the TCP connection, at which
point self.rfile.read returns empty data and the loop exits.
What would be needed to extend the above example to support chunked
Transfer-Encoding as well?
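One possible direction, as a minimal sketch rather than a finished solution: parse the chunk framing by hand, reading one size line and then exactly that many bytes, so the server never reads past the final zero-size chunk. The generator name iter_chunked is hypothetical, and the sketch assumes rfile is a buffered binary stream like self.rfile; chunk extensions are ignored and trailer fields are discarded.

```python
import hashlib
import io


def iter_chunked(rfile):
    """Yield the data of each chunk from a chunked-encoded request body.

    Hypothetical helper; assumes a buffered binary file-like object.
    """
    while True:
        size_line = rfile.readline()
        if not size_line:
            raise ConnectionError("connection closed before the final chunk")
        # The size is hexadecimal; drop any chunk extension after ';'.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = rfile.read(size)
        if len(data) != size:
            raise ConnectionError("connection closed in the middle of a chunk")
        yield data
        rfile.read(2)  # consume the CRLF that follows the chunk data
    # Discard optional trailer fields up to the empty line ending the body.
    while rfile.readline() not in (b"\r\n", b"\n", b""):
        pass


# Stand-in for self.rfile: a chunked body followed by unrelated bytes.
stream = io.BytesIO(b"4\r\nhell\r\n1\r\no\r\n0\r\n\r\nNEXT")
md5 = hashlib.md5()
for block in iter_chunked(stream):
    md5.update(block)
print(md5.hexdigest())
```

In do_PUT, this could be selected when 'chunked' appears in self.headers.get('Transfer-Encoding', '').lower(), falling back to the Content-Length loop otherwise. Because the generator stops right after the terminating chunk, the bytes of the next keep-alive request stay unread in self.rfile.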