I have looked at the requests documentation, but I can't seem to find anything. How do I only request the header, so I can assess filesize?
3 Answers
103
Send a HEAD request:
>>> import requests
>>> response = requests.head('http://example.com')
>>> response.headers
{'connection': 'close',
'content-encoding': 'gzip',
'content-length': '606',
'content-type': 'text/html; charset=UTF-8',
'date': 'Fri, 11 Jan 2013 02:32:34 GMT',
'last-modified': 'Fri, 04 Jan 2013 01:17:22 GMT',
'server': 'Apache/2.2.3 (CentOS)',
'vary': 'Accept-Encoding'}
A HEAD request is like a GET request that only downloads the headers. Note that it's up to the server to actually honor your HEAD request. Some servers will only respond to GET requests, so you'll have to send a GET request and just close the connection instead of downloading the body. Other times, the server just never specifies the total size of the file.
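As a rough sketch of that fallback: try HEAD first, and if it gives no usable `Content-Length`, send a streamed GET and close it without reading the body. The helper names `parse_content_length` and `remote_size` are made up for illustration, and the 10-second timeout is an arbitrary choice.

```python
from typing import Mapping, Optional

import requests


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if it is missing or malformed."""
    value = headers.get("Content-Length")
    if value is None or not value.strip().isdigit():
        return None
    return int(value.strip())


def remote_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Hypothetical helper: size reported by the server, or None if unknown."""
    # requests.head does not follow redirects unless asked to.
    head = requests.head(url, allow_redirects=True, timeout=timeout)
    if head.status_code == 200:
        size = parse_content_length(head.headers)
        if size is not None:
            return size
    # Fallback: a streamed GET fetches only the headers up front; leaving the
    # with block closes the response without downloading the body.
    with requests.get(url, stream=True, timeout=timeout) as resp:
        return parse_content_length(resp.headers)
```

If both attempts return `None`, the server simply did not advertise a size.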

Blender
- (27) Note that not every response will necessarily include a `content-length`. Sometimes the response is generated using `Transfer-Encoding: chunked`, in which case there's no way to know how long the response would be unless you actually get the whole response. – Francis Avila Jan 11 '13 at 02:38
- (3) This is different than the size retrieved using `urllib.urlopen(url).info()['content-length']`, so not exactly what I wanted. – Ciasto piekarz Jul 05 '14 at 09:57
- What can I do if there is no content-length in the response headers? – ghost21blade Apr 11 '21 at 06:54
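When there is no `Content-Length` at all, the only option left is the one the first comment describes: read the whole response and count. A minimal sketch, assuming the hypothetical helper names `count_bytes` and `measured_size` and an arbitrary 64 KiB chunk size:

```python
from typing import Iterable

import requests


def count_bytes(chunks: Iterable[bytes]) -> int:
    """Add up the lengths of the chunks without keeping them in memory."""
    return sum(len(chunk) for chunk in chunks)


def measured_size(url: str, timeout: float = 10.0) -> int:
    """Hypothetical helper: download the body in chunks and return its length."""
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        # iter_content yields decoded bytes, so for a gzip-compressed response
        # the total can be larger than the number of bytes sent on the wire.
        return count_bytes(resp.iter_content(chunk_size=64 * 1024))
```

This costs a full download, so it only makes sense when the size is needed and the server will not report it.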
73
Use `requests.get(url, stream=True).headers['Content-length']`.
`stream=True` means that when the function returns, only the response headers have been downloaded; the response body has not.
Both `requests.get` and `requests.head` can get you the headers, but there are advantages to using `get`:
- `get` is more flexible: if you want to download the response body after inspecting the length, you can simply access the `content` property or use an iterator, which will download the content in chunks.
- "HEAD request SHOULD be identical to the information sent in response to a GET request," but that's not always the case.
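To illustrate the first point, here is a minimal sketch of inspecting the length and only then deciding whether to pull the body. The names `within_limit` and `fetch_if_small`, the size limit, and the chunk size are all assumptions made for the example.

```python
from typing import Optional

import requests


def within_limit(declared: Optional[str], max_bytes: int) -> bool:
    """True only if a numeric Content-Length was declared and fits the limit."""
    return declared is not None and declared.isdigit() and int(declared) <= max_bytes


def fetch_if_small(url: str, max_bytes: int, timeout: float = 10.0) -> Optional[bytes]:
    """Hypothetical helper: return the body, or None if it is too large or unsized."""
    # The with block closes the response either way, so the connection is
    # released even when the body is never read.
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        if not within_limit(resp.headers.get("Content-Length"), max_bytes):
            return None
        # Only now is the body downloaded, one chunk at a time.
        return b"".join(resp.iter_content(chunk_size=64 * 1024))
```

With `head`, the same decision would need a second request to get the body.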
Here is an example of getting the length of an MIT open course video:
import requests

MitOpenCourseUrl = "http://www.archive.org/download/MIT6.006F11/MIT6_006F11_lec01_300k.mp4"
resHead = requests.head(MitOpenCourseUrl)
resGet = requests.get(MitOpenCourseUrl, stream=True)
resHead.headers['Content-length']  # output 169
resGet.headers['Content-length']  # output 121291539
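One plausible explanation for the mismatch (an assumption, not checked against archive.org) is a redirect: `requests.get` follows redirects by default, while `requests.head` does not, so the HEAD call may be reporting the headers of the redirect response itself. The sketch below reproduces that effect against a throwaway local server instead of the real host; the handler, the paths, and both sizes are stand-ins borrowed from the numbers above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

REDIRECT_SIZE = 169    # stand-in for the length reported by the redirect
FILE_SIZE = 121291539  # stand-in for the length of the real file


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical server: /download redirects to /file."""

    def do_HEAD(self):
        if self.path == "/download":
            self.send_response(302)
            self.send_header("Location", "/file")
            self.send_header("Content-Length", str(REDIRECT_SIZE))
        else:
            self.send_response(200)
            self.send_header("Content-Length", str(FILE_SIZE))
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/download"

try:
    with requests.Session() as session:
        session.trust_env = False  # ignore proxy settings for the local demo
        no_follow = session.head(url)                      # default: redirects not followed
        followed = session.head(url, allow_redirects=True) # opt in explicitly
finally:
    server.shutdown()
    server.server_close()

no_follow_status = no_follow.status_code
no_follow_length = int(no_follow.headers["Content-Length"])
followed_length = int(followed.headers["Content-Length"])
```

If that is what happens with the real URL, passing `allow_redirects=True` to `requests.head` should bring the two results in line.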

watashiSHUN
- [Relevant note](https://requests.readthedocs.io/en/latest/user/advanced/#body-content-workflow) from the documentation of Requests: "If you set `stream` to `True` when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call `Response.close`. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using `stream=True`, you should make the request within a `with` statement to ensure it’s always closed" – zahypeti Nov 15 '20 at 21:37
-1
Get the file size: `file.headers.get('Content-Length')`

Sadia
- (2) The question was about the [Requests](https://requests.readthedocs.io/en/master/) library (see the title and the [python-requests](https://stackoverflow.com/questions/tagged/python-requests) tag) and how to use it on the *client side* to request the headers from an endpoint. This `request.FILES` is different: it looks like part of a framework ([Django?](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.FILES)), and is used on the *server side* to handle received requests. – Gino Mempin Oct 13 '20 at 12:06