GitHub raw files - ETag algorithm

Question

Does anyone know how GitHub generates the ETag it returns when serving raw content?

As far as I can tell, this is not MD5, SHA1 or any common SHA variant.

Example HTTP headers:

HTTP/1.1 200 OK
Server: nginx/1.0.13
Date: Tue, 05 Jun 2012 19:46:08 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Status: 200 OK
ETag: "aa1da178ae0a43e23ce49a6b8f474738"

The ETag length is 32 characters, suggesting MD5, but this does not match the MD5 checksum of the downloaded file (downloaded using curl).
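
The comparison described above can be sketched roughly as follows. This is only an illustration of the check, not anything GitHub documents: the function name `matching_digests` and the default list of algorithms are assumptions made for the example, and `body` stands for the bytes saved by curl.

```python
import hashlib


def matching_digests(body: bytes, etag: str,
                     algorithms=("md5", "sha1", "sha256")) -> list[str]:
    """Return the names of the hash algorithms whose hex digest of
    `body` equals the ETag value (hypothetical helper for this check)."""
    value = etag.strip()
    # Drop the weak-validator prefix if present, then the surrounding quotes.
    if value.startswith("W/"):
        value = value[2:]
    value = value.strip('"').lower()
    return [name for name in algorithms
            if hashlib.new(name, body).hexdigest() == value]
```

An empty result for the downloaded file means none of the listed digests matches the header, which is the situation described in the question.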

I am aware that ETags should be treated as opaque identifiers. Still, I'm curious.

@dystroy Aha; I couldn't find any information through Google or on github.com. This is, shall we say, a shot in the dark. — Ishan, Jun 05 '12 at 20:16
i can only confirm that github's `etag` is useless for integrity checking. nowadays it's some sha256 hash with a private hash algorithm — milahu, May 08 '22 at 12:05

score 0 · Answer 1 · answered Jun 05 '12 at 20:15


My guess would be they are using the stock nginx etag module.

https://github.com/mikewest/nginx-static-etags/blob/master/ngx_http_static_etags_module.c


Tyler Eaves



Thanks, but no dice. That module generates ETags by concatenating some info from the Nginx request, the file size (hex), and the file mtime (hex), using the C format string "%s_%X_%X". Incidentally, even with the Nginx version specified in that project's README, something's broken. This is what the ETag looks like: `Etag: /redhat-release HTTP/1.1_newline_User-Agent_22_4FCFB809`. That's URI path + protocol version + _\n_ + part of a request header + hex file size + hex file mtime. – Ishan Jun 06 '12 at 20:31
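
For illustration, the "%s_%X_%X" layout mentioned in the comment can be mimicked like this. It is a minimal sketch that only mirrors the format string as quoted; `static_etag` and its parameter names are hypothetical, and what the module actually passes as the string part is exactly what the comment reports as broken.

```python
def static_etag(prefix: str, size: int, mtime: int) -> str:
    """Mimic C's "%s_%X_%X": a string, then the file size and the
    file mtime as uppercase hex with no padding."""
    # Assumes non-negative integers, matching unsigned %X in C.
    return f"{prefix}_{size:X}_{mtime:X}"


# Size and mtime taken from the tail of the ETag quoted in the comment.
print(static_etag("/redhat-release", 0x22, 0x4FCFB809))
```

None of these inputs depends on the file's contents, so an ETag built this way could not match a checksum of the download.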
