How to check formatting of a SHA-1 message-digest

Question

I need some basic validation (sanity checks) to determine whether some input is a valid SHA-1 sum or just a (random) string, if possible with simple parsing rules or a regex.

Are there any rules that a SHA-1 sum should adhere to? I cannot find any, but from quick tests, all seem to be hexadecimal and around 40 characters long[1].

I am not interested in tests that prove whether or not the SHA-1 sum was made in a secure, properly random or other manner. Just that the format is correct.

I am also not interested in testing that the digest is an actual representation of some message, just that it has the format of a digest in the first place.

For the curious: this is for an application where I build avatars for users based on, among other things, their uuid. I don't, however, want to place those uuids in the URL, but obfuscate them a little. So instead of avatars/baa4833d-b962-4ab1-87c5-283c9820eac4.png, we request avatars/5f2a13cb1d84a2e019842cdb8d0c8b03c9e1e414.png, where 5f2a... is e.g. Digest::SHA1.hexdigest(uuid + "secret").

On the receiving side, I am adding some basic protection that sends back a 400 Bad Request whenever something is obviously wrong, such as avatars/haxor.png or avatars/traversal../../../../attempt.png. Note that this is a very much simplified example.

[1] Two tests with different outcomes:

Using sha1sum on Ubuntu Linux:

$ echo "hello" | sha1sum | cut -d" " -f1 | wc -c
41

Using Ruby's Digest:

Digest::SHA1.hexdigest("hello").length
=> 40

Edit: it turns out this was me being stupid: `wc -c` includes the newline, as kennytm points out in the comments. Still, is it safe to assume it will always be 40 characters?

@kennytm: so is it always 40, then? Is that set in a standard somewhere? — berkes, Mar 30 '17 at 08:40
[SHA-1 generates 160-bit hash](https://en.wikipedia.org/wiki/SHA-1) so the hexadecimal representation is always 160 / 4 = 40 digits long. — kennytm, Mar 30 '17 at 08:42
@kennytm, so that sounds like `[a-f0-9]{40}` would be enough to validate the format, no? — berkes, Mar 30 '17 at 08:49

score 1 · Accepted Answer · edited Mar 30 '17 at 08:53

SHA-1 has a 160-bit digest.

160 bits is 160 / 8 = 20 bytes.
20 bytes rendered in hexadecimal format have a length of 40 characters (digits), two per byte. Each digit is in [0-9a-f].
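
The arithmetic can be checked directly in Ruby. A small sketch using the standard library's `Digest::SHA1`; the input string "hello" is only an example:

```ruby
require "digest"

raw = Digest::SHA1.digest("hello")      # raw binary digest
hex = Digest::SHA1.hexdigest("hello")   # the same digest, hex-encoded

puts raw.bytesize                # 20 bytes = 160 bits
puts hex.length                  # 40 hex digits, two per byte
puts hex == raw.unpack1("H*")    # hex string is just the bytes, hex-encoded
```

The lengths do not depend on the input: any message, including the empty string, yields 20 raw bytes and 40 hex digits.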

So the following regex should correctly validate a SHA-1 sum rendered as a string in (lowercase) hexadecimal format:

/^[0-9a-f]{40}$/
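
One caveat if this is used from Ruby, as in the question: there `^` and `$` match at line boundaries rather than at the start and end of the whole string, so a multi-line input could slip through. A minimal sketch that anchors with `\A` and `\z` instead. The constant `SHA1_HEX` and the method `sha1_hex?` are hypothetical names, and accepting only lowercase assumes the digests come from `Digest::SHA1.hexdigest`:

```ruby
require "digest"

# Lowercase hex, exactly 40 digits, anchored to the whole string.
SHA1_HEX = /\A[0-9a-f]{40}\z/

# Hypothetical helper: true if input has the format of a SHA-1 hex digest.
def sha1_hex?(input)
  input.is_a?(String) && SHA1_HEX.match?(input)
end

puts sha1_hex?(Digest::SHA1.hexdigest("hello"))   # true
puts sha1_hex?("haxor")                           # false
puts sha1_hex?("haxor\n" + "a" * 40)              # false, although /^[0-9a-f]{40}$/ matches it
```

If uppercase digests also need to be accepted, adding the `i` flag to the regex should be enough.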

edited Mar 30 '17 at 08:53

berkes

answered Mar 30 '17 at 08:51

Matthijs
