
First, this is not a duplicate of the many existing questions about simply getting a file size by requesting headers, e.g. https://unix.stackexchange.com/questions/450402/how-to-retrieve-downloadable-file-size-with-curl-command

I actually started by reading those when I needed the size of a remote file, and concluded that running curl with -I/--head should give me a Content-Length: header that I can use. Instead, I get a 403 error.

# not posting full curl command for sake of privacy and NSFW link
# but it was derived from firefox's copy as curl command result
curl -I 'https://somedomain.xyz/files/video.mp4' -H 'User-Agent: Mozilla/5.0' -H 'Referer: xyz' -H 'Cookie: __cfduid=xyz' 

HTTP/1.1 403 Forbidden
Date: Wed, 25 Mar 2020 17:41:31 GMT
Content-Type: text/html;charset=iso-8859-1
Connection: keep-alive
Cache-Control: must-revalidate,no-cache,no-store
Cf-Railgun: direct (starting new WAN connection)
CF-Cache-Status: DYNAMIC
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: xyz-YUL

This question suggests the cause is a missing cookie or referer: https://unix.stackexchange.com/questions/139698/why-would-curl-and-wget-result-in-a-403-forbidden I tried everything listed there and it made no difference. Still 403.

Another question suggests the site admin could be blocking HEAD requests: HEAD request receives "403 forbidden" while GET "200 ok"?

What is odd to me is that if I try to download the file, curl instantly reports the size in the Total column, although in human-readable form.

curl 'https://somedomain.xyz/files/video.mp4' -H 'User-Agent: Mozilla/5.0' -H 'Referer: xyz' -H 'Cookie: __cfduid=xyz' 

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
0  384M    0 3148k    0     0  1594k      0  0:04:07  0:00:01  0:04:06 1594k

How is curl achieving this, and can I use it to obtain the size instead of requesting the headers? Alternatively, is there a way to get the header request to work?

Thanks!

danile4431

1 Answer

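The solution was to use curl -X GET -I with the URL and cookies. The Referer and User-Agent headers weren't necessary. With -I alone, curl sends a HEAD request, which this server rejects; adding -X GET changes the request method to GET while -I still prints only the response headers.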
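To use the size in a script, the Content-Length value can be pulled out of the header dump. The following is only a minimal sketch: the helper name content_length is made up for illustration, and it assumes the server actually sends a Content-Length header. The demo line feeds it canned headers so it runs without network access; the real curl call is left commented out.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print
# the value of the last Content-Length header seen.
content_length() {
  # Strip the CR from CRLF line endings, then match the header name
  # case-insensitively and print its value.
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-length" { print $2 }' | tail -n 1
}

# Real usage (placeholder URL and cookie from the question):
# curl -s -X GET -I 'https://somedomain.xyz/files/video.mp4' -H 'Cookie: __cfduid=xyz' | content_length

# Offline demo with canned headers:
printf 'HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\nContent-Length: 9895038\r\n\r\n' | content_length
# prints 9895038
```

tail -n 1 is there in case redirects are followed and several header blocks are printed; only the final one is kept.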

For comparison

HEAD request:

curl -I 'https://somedomain.xyz/files/video.mp4' -H 'Cookie: __cfduid=xyz'

HTTP/1.1 403 Forbidden
Date: Fri, 27 Mar 2020 17:44:52 GMT
Content-Type: text/html;charset=iso-8859-1
Connection: keep-alive
Cache-Control: must-revalidate,no-cache,no-store
Cf-Railgun: direct (starting new WAN connection)
CF-Cache-Status: DYNAMIC
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: XYZ-YUL

GET request, printing headers only:

curl -X GET -I 'https://somedomain.xyz/files/video.mp4' -H 'Cookie: __cfduid=xyz'

HTTP/1.1 200 OK
Date: Fri, 27 Mar 2020 17:45:14 GMT
Content-Type: video/mp4
Content-Length: 9895038
Connection: keep-alive
Accept-Ranges: bytes
Cache-Control: public, max-age=31536000
Cf-Railgun: direct (waiting for pending WAN connection)
Expires: Tue, 31 Dec 2030 23:30:45 GMT
Last-Modified: Wed, 04 Mar 2020 16:52:51 CET
X-Ua-Compatible: IE=edge,chrome=1
CF-Cache-Status: DYNAMIC
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: XYZ-YUL
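
Since the 200 response advertises Accept-Ranges: bytes, another option that may work is an ordinary GET for just the first byte, then reading the total size from the Content-Range header of the 206 reply. This is a sketch under assumptions, not something tested against this site: the helper name total_from_content_range is hypothetical, and it only helps if the server honors the Range header. The demo line uses a canned response so it runs offline.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print the
# total size from a "Content-Range: bytes 0-0/<total>" header.
total_from_content_range() {
  tr -d '\r' | awk 'tolower($1) == "content-range:" {
    n = split($NF, parts, "/")   # "0-0/9895038" -> "0-0", "9895038"
    print parts[n]
  }' | tail -n 1
}

# Real usage (placeholder URL and cookie from the question):
# -r 0-0 asks for the first byte only, -D - dumps headers to stdout,
# -o /dev/null discards the one-byte body.
# curl -s -r 0-0 -D - -o /dev/null 'https://somedomain.xyz/files/video.mp4' -H 'Cookie: __cfduid=xyz' | total_from_content_range

# Offline demo with a canned 206 response:
printf 'HTTP/1.1 206 Partial Content\r\nContent-Range: bytes 0-0/9895038\r\nContent-Length: 1\r\n\r\n' | total_from_content_range
# prints 9895038
```

If the server ignores the Range header and answers 200 with the full body, there is no Content-Range header and the helper prints nothing.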
danile4431