Get file size using python-requests, while only getting the header

Question

I have looked at the requests documentation, but I can't seem to find anything. How do I request only the headers, so I can check the file size?

Blender · Accepted Answer · 2018-02-05T00:45:36.467

103

Send a HEAD request:

>>> import requests
>>> response = requests.head('http://example.com')
>>> response.headers
{'connection': 'close',
 'content-encoding': 'gzip',
 'content-length': '606',
 'content-type': 'text/html; charset=UTF-8',
 'date': 'Fri, 11 Jan 2013 02:32:34 GMT',
 'last-modified': 'Fri, 04 Jan 2013 01:17:22 GMT',
 'server': 'Apache/2.2.3 (CentOS)',
 'vary': 'Accept-Encoding'}

A HEAD request is like a GET request, except that the server returns only the headers and no body. Note that it's up to the server to actually honor your HEAD request. Some servers respond properly only to GET requests, in which case you'll have to send a GET request and close the connection instead of downloading the body. Other times, the server simply never specifies the total size of the file.
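A minimal sketch of that fallback, assuming the `requests` library is installed. The helper names `parse_content_length` and `remote_size` and the 10-second timeout are made up for illustration and are not part of Requests:

```python
from typing import Mapping, Optional

import requests


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if it is missing or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def remote_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Try HEAD first; fall back to a streamed GET that is closed unread."""
    try:
        # requests.head does not follow redirects unless asked to.
        head = requests.head(url, allow_redirects=True, timeout=timeout)
        if head.ok:
            size = parse_content_length(head.headers)
            if size is not None:
                return size
    except requests.RequestException:
        pass  # HEAD failed outright; fall through to GET

    # stream=True fetches only the headers; leaving the with block closes
    # the connection without downloading the body.
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        return parse_content_length(resp.headers)
```

Something like `remote_size('http://example.com')` would then return an int, or `None` when the server never reports a size.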

edited Feb 05 '18 at 00:45

answered Jan 11 '13 at 02:32

Blender


27

Note that not every response will necessarily include a `content-length`--sometimes the response is generated using `Transfer-Encoding: chunked`, in which case there's no way to know how long the response would be unless you actually get the whole response. – Francis Avila Jan 11 '13 at 02:38
3

This is different from the size retrieved using `urllib.urlopen(url).info()['content-length']`, so it's not exactly what I wanted. – Ciasto piekarz Jul 05 '14 at 09:57
What can I do if there is no content-length in the response headers? – ghost21blade Apr 11 '21 at 06:54
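If there is no `content-length` at all (for example with a chunked response), the only option I know of is to read the body and count it. A rough sketch, assuming `requests` is installed; `count_bytes` and `measured_size` are hypothetical helper names, and the chunk size and timeout are arbitrary:

```python
from typing import Iterable

import requests


def count_bytes(chunks: Iterable[bytes]) -> int:
    """Sum the lengths of byte chunks without keeping them in memory."""
    return sum(len(chunk) for chunk in chunks)


def measured_size(url: str, timeout: float = 10.0) -> int:
    """Download the body in chunks and count it; for when Content-Length is absent."""
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        # iter_content yields the body after any gzip/deflate decoding, so this
        # is the decoded size, which can differ from an advertised Content-Length.
        return count_bytes(resp.iter_content(chunk_size=64 * 1024))
```

This still transfers the whole file, so it only saves memory, not bandwidth.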

score 73 · Answer 2 · answered Jun 01 '17 at 06:21

Use requests.get(url, stream=True).headers['Content-length']

stream=True means that when the function returns, only the response headers have been downloaded; the response body has not.

Both requests.get and requests.head can get you the headers, but there are advantages to using get:

get is more flexible: if you want to download the response body after inspecting the length, you can simply access the content property, or use an iterator that downloads the content in chunks.
The headers returned for a "HEAD request SHOULD be identical to the information sent in response to a GET request", but in practice that is not always the case.

Here is an example of getting the length of an MIT open course video:

import requests

MitOpenCourseUrl = "http://www.archive.org/download/MIT6.006F11/MIT6_006F11_lec01_300k.mp4"
resHead = requests.head(MitOpenCourseUrl)  # requests.head does not follow redirects by default
resGet = requests.get(MitOpenCourseUrl, stream=True)  # requests.get follows redirects by default
resHead.headers['Content-length'] # output 169
resGet.headers['Content-length'] # output 121291539
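One possible explanation for a mismatch like this, which I have not verified for this URL, is a redirect: `requests.head` does not follow redirects unless you pass `allow_redirects=True`, while `requests.get` does, so the HEAD figure may be the size of the redirect response itself. A small sketch for inspecting each hop, assuming `requests` is installed; `hop_lengths` and `head_following_redirects` are hypothetical names:

```python
from typing import List, Optional, Tuple

import requests


def hop_lengths(resp: requests.Response) -> List[Tuple[int, Optional[str]]]:
    """List (status code, Content-Length) for every redirect hop plus the final response."""
    hops = list(resp.history) + [resp]
    return [(r.status_code, r.headers.get("Content-Length")) for r in hops]


def head_following_redirects(url: str, timeout: float = 10.0) -> List[Tuple[int, Optional[str]]]:
    """Send a HEAD request that follows redirects and report each hop."""
    resp = requests.head(url, allow_redirects=True, timeout=timeout)
    return hop_lengths(resp)
```

If the first entry is a 3xx status with a small length and the last one is a 200 with the large length, the redirect is what HEAD was reporting.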

[Relevant note](https://requests.readthedocs.io/en/latest/user/advanced/#body-content-workflow) from the documentation of Requests: "If you set `stream` to `True` when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call `Response.close`. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using `stream=True`, you should make the request within a `with` statement to ensure it’s always closed" — zahypeti, Nov 15 '20 at 21:37
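Putting the `with` advice together with the "inspect the length, then decide" idea from the answer, a sketch could look like the following. It assumes `requests` is installed; `fits_limit`, `download_if_small`, the size limit, and the destination path are all illustrative choices rather than anything Requests provides:

```python
from typing import Mapping

import requests


def fits_limit(headers: Mapping[str, str], max_bytes: int) -> bool:
    """True only if Content-Length is present, numeric, and within max_bytes."""
    value = headers.get("Content-Length", "")
    return value.isdecimal() and int(value) <= max_bytes


def download_if_small(url: str, dest_path: str, max_bytes: int,
                      timeout: float = 10.0) -> bool:
    """Fetch headers first; download the body only if it is small enough."""
    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        if not fits_limit(resp.headers, max_bytes):
            # Leaving the with block closes the connection with the body unread.
            return False
        with open(dest_path, "wb") as out:
            for chunk in resp.iter_content(chunk_size=64 * 1024):
                out.write(chunk)
        return True
```

A response without a `Content-Length` is treated as too large here, which is a conservative choice; you may prefer to stream it and count bytes instead.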

Sadia · Answer 3 · 2021-10-21T16:53:30.830

-1

Get the file size:

file.headers.get('Content-Length')

edited Oct 21 '21 at 16:53

answered Oct 13 '20 at 08:51

Sadia


2

The question was about [Requests](https://requests.readthedocs.io/en/master/) library (see the title and the [python-requests](https://stackoverflow.com/questions/tagged/python-requests) tag) and how to use it on *client-side* to request the headers from an endpoint. This `request.FILES` is different, it looks like part of a framework ([Django?](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.FILES)), and is *server-side* to handle received requests. – Gino Mempin Oct 13 '20 at 12:06
