Do similar passwords have similar hashes?

Question

Our computer system at work requires users to change their password every few weeks, and you cannot have the same password as you had previously. It remembers something like 20 of your last passwords. I discovered most people simply increment a digit at the end of their password, so "thisismypassword1" becomes "thisismypassword2" then 3, 4, 5 etc.

Since all of these passwords are stored somewhere, I wondered if there was any weakness in the hashes themselves, for standard hashing algorithms used to store passwords like MD5. Could a hacker increase their chances of brute-forcing the password if they have a list of hashes of similar passwords?

If you have to change your password every few weeks, and can't reuse any of your last 20 passwords, it's no wonder users just increment. It sounds to me like, if you need passwords changed that often to ensure security, then passwords aren't the means by which you should be ensuring security. – CaffGeek, Apr 21 '10 at 14:21
Note that although there's no similarity in the hash, changing passwords only adds any security if it prevents attackers who somehow find out an old password from using it. For instance if it takes an attacker a month to perform a dictionary/brute force attack given that they know the hash, then changing your password every month means that attack is useless. *Unless*, the password is similar to the old one, in which case the attacker uses the old password as a starting-point for a new brute force attack on the new hash, which completes in microseconds and they kick your ass. — Steve Jessop, Apr 21 '10 at 15:07
I'm pretty sure that in practice, enforcing frequent password changes reduces security rather than increasing it, since it makes it harder for users to choose and remember good passwords. It's based on the ancient practice of the "password of the day", where it's necessary to change the password often because *your entire army knows it*, and it is guaranteed to leak to the enemy in due course. Computer passwords should be known to 1 person only. – Steve Jessop, Apr 21 '10 at 15:09
@Steve Your answer conflicts with the other answers who say the result will be radically different, so I am not sure who is correct. As for your comments about reducing security, I am inclined to agree. They are standard features of Windows Server XXXX though, to force users to change passwords regularly and prevent them switching their password back and forth immediately to keep the same password — NibblyPig, Apr 21 '10 at 15:28
I don't intend to conflict with those other answers. They say there is no similarity in outputs when you hash similar inputs, which is true. However, they don't comment in general on the security of the resulting system. If an attacker somehow figures out that your password was "thisismypassword3" a few months ago, his first few guesses at your current password will be "thisismypassword4", etc. So it does completely defeat the purpose of changing your password if your new password is one of those, but for reasons that have nothing to do with the hash. – Steve Jessop, Apr 21 '10 at 15:41
@Steve I see, sorry I misinterpreted your answer. I assumed you meant that if they knew your old password, it would be quick to brute force the new one by comparing hashes, sorry. Of course, you are correct. — NibblyPig, Apr 21 '10 at 15:56
I've turned all these comments into an answer now, since they've gone on a bit :-) — Steve Jessop, Apr 21 '10 at 15:58

score 11 · Answer 1 · answered Apr 21 '10 at 14:11

With a good hash algorithm, similar inputs are scattered across the whole output space, so similar passwords will have very different hashes.

You can try this with MD5 and different strings.

"hello world" - 5eb63bbbe01eeed093cb22bb8f5acdc3
"hello  world" - fd27fbb9872ba413320c606fdfb98db1

Oded

Are the standard algorithms used in .NET or Windows 'good' in this way? Although to me those hashes look completely dissimilar, to a computer that may not be the case. – NibblyPig Apr 21 '10 at 15:25

score 10 · Answer 2 · answered Apr 21 '10 at 14:11

It depends on the hashing algorithm. If it is any good, similar passwords should not have similar hashes.

kemiller2002

Yes, and if you use the hash for cryptographic purposes, it better have that property. (Note that there are other domains where you *want* similar inputs to have similar hashes.) – Jörg W Mittag Apr 21 '10 at 14:46
Good ones do yes. You can always create a hash algorithm that doesn't. It varies from algorithm to algorithm. – kemiller2002 Apr 21 '10 at 16:07

Steve Jessop · Accepted Answer · 2013-09-09T07:58:07.543

Do similar passwords have similar hashes?

No.

Any similarity, even a complex correlation, would be considered a weakness in the hash. Once discovered by the crypto community it would be published, and enough discovered weaknesses in the hash eventually add up to advice not to use that hash any more.

Of course there's no way to know whether a hash has undiscovered weaknesses, or weaknesses known to an attacker but not published, in which case most likely the attacker is a well-funded government organization. The NSA certainly is in possession of non-public theoretical attacks on some crypto components, but whether those attacks are usable is another matter. GCHQ probably is. I'd guess that a few other countries have secret crypto programs with enough mathematicians to have done original work: China would be my first guess. All you can do is act on the best available information. And if the best available information says that a hash is "good for crypto", then one of the things that means is no usable similarities of this kind.

Finally, some systems use weak hashes for passwords -- either due to ignorance by the implementer or legacy. All bets are off for the properties of a hashing scheme that either hasn't had public review, or else has been reviewed and found wanting, or else is old enough that significant weaknesses have eventually been found. MD5 is broken for some purposes (since there exist practical means to generate collisions) but not for all purposes. AFAIK it's OK for this, in the sense that there is no practical pre-image attack, and having a handful of hashes of related plaintexts is no better than having a handful of hashes of unrelated plaintexts. But for unrelated reasons you shouldn't really use a single application of any hash for password storage anyway, you should use multiple rounds.
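
As a rough illustration of "multiple rounds": the answer does not name a scheme, so the sketch below assumes PBKDF2-HMAC-SHA256 from Python's standard library. The function name, salt value, and round count are made up for the example.

```python
import hashlib

def stretched_hash(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256, iterated `rounds` times."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)

# Fixed salt only so the demo is repeatable (an assumption for illustration);
# a real system would generate a random salt per user.
demo_salt = b"demo-salt-not-for-production"

h4 = stretched_hash("thisismypassword4", demo_salt)
h5 = stretched_hash("thisismypassword5", demo_salt)
print(h4.hex())
print(h5.hex())
```

Each guess now costs the attacker `rounds` hash computations instead of one, which slows brute force by the same factor. It does nothing to make "thisismypassword5" harder to guess from "thisismypassword4".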

Could a hacker increase their chances of brute-forcing the password if they have a list of hashes of similar passwords?

Indirectly, yes, knowing that those are your old passwords. Not because of any property of the hash, but suppose the attacker manages to (very slowly) brute-force one or more of your old passwords using those old hashes, and sees that in the past it has been "thisismypassword3" and "thisismypassword4".

Your password has since changed, to "thisismypassword5". Well done, by changing it before the attacker cracked it, you have successfully ensured that the attacker did not recover a valuable password! Victory! Except it does you no good, since the attacker has the means to guess the new one quickly anyway using the old password(s).

Even if the attacker only has one old password, and therefore cannot easily spot a trend, password crackers work by trying passwords which are similar to dictionary words and other values. To over-simplify a bit, it will try the dictionary words first, then strings consisting of a word with one extra character added, removed or changed, then strings with two changes, and so on.

By including your old password in the "other values", the attacker can ensure that strings very similar to it are checked early in the cracking process. So if your new password is similar to old ones, then having the old hashes does have some value to the attacker - reversing any one of them gives him a good seed to crack your current password.
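
A minimal sketch of that seeding idea, assuming unsalted MD5 storage and a cracker that only tries strings one edit away from the seed. The helper names and the alphabet are hypothetical; real cracking tools use far richer mutation rules.

```python
import hashlib
import string

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def one_edit_variants(seed: str, alphabet: str = string.ascii_lowercase + string.digits):
    """Yield strings one character deletion, replacement, or insertion away from seed."""
    for i in range(len(seed)):
        yield seed[:i] + seed[i + 1:]                 # delete character i
        for ch in alphabet:
            if ch != seed[i]:
                yield seed[:i] + ch + seed[i + 1:]    # replace character i
    for i in range(len(seed) + 1):
        for ch in alphabet:
            yield seed[:i] + ch + seed[i:]            # insert before position i

def crack_from_seed(target_hash: str, seed: str):
    """Return (password, number_of_guesses) if a one-edit variant matches, else None."""
    for guesses, candidate in enumerate(one_edit_variants(seed), start=1):
        if md5_hex(candidate) == target_hash:
            return candidate, guesses
    return None

old_password = "thisismypassword4"            # recovered earlier from an old hash
current_hash = md5_hex("thisismypassword5")   # the hash the attacker wants to reverse
result = crack_from_seed(current_hash, old_password)
print(result)
```

For a 17-character seed and a 36-character alphabet there are only about 1,260 one-edit candidates, so the search finishes almost instantly. The hash itself leaked nothing; the old plaintext did all the work.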

So, incrementing your password regularly doesn't add much. Changing your password to something that's guessable from the old password puts your attacker in the same position as they'd be in if they knew nothing at all, but your password was guessable from nothing at all.

The main practical attacks on password systems these days are eavesdropping (via keyloggers and other malware) and phishing. Trying to reverse password hashes isn't a good percentage attack, although if an attacker has somehow got hold of an /etc/passwd file or equivalent, they will break some weak passwords that way on the average system.

Accepted, this answer has the most detail and mentions a lot of things I missed while mulling over the issue. Thanks! – NibblyPig, Apr 22 '10 at 13:51

score 7 · Answer 4 · answered Apr 21 '10 at 14:12

The whole point of a cryptographic hash is that similar passwords would absolutely not create similar hashes.

More importantly, you would most likely salt the password so that even the same passwords do not produce the same hash.
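
A small sketch of salting, assuming a random 16-byte salt stored next to each digest and PBKDF2-HMAC-SHA256 as the hash. Those choices, and the function names, are illustrative only.

```python
import hashlib
import hmac
import os

ROUNDS = 100_000  # illustrative work factor, not a recommendation

def new_record(password: str) -> tuple:
    """Return (salt, digest) for storage; the salt is random for every record."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return hmac.compare_digest(candidate, digest)

salt_a, digest_a = new_record("thisismypassword1")
salt_b, digest_b = new_record("thisismypassword1")   # same password, second user
print(digest_a.hex())
print(digest_b.hex())
print(verify("thisismypassword1", salt_a, digest_a))
```

The two printed digests differ even though the passwords are identical, so an attacker cannot tell from the stored values that two users share a password.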

Robin Day

score 4 · Answer 5 · answered Apr 21 '10 at 14:12

It depends on the hash algorithm used. A good one will distribute similar inputs to disparate outputs.

Mitch Wheat

Surely, a good hash algorithm will distribute *all* inputs to disparate outputs if possible? – RCIX Apr 22 '10 at 05:34
@RCIX: "all inputs to disparate outputs" - that is only possible if the resulting hash is as wide as the input space! The reason for using a hash in the first place is to create a result that is much smaller than the input. – Mitch Wheat Apr 22 '10 at 05:53
I think you're both referring to preimage resistance and collision resistance: http://en.wikipedia.org/wiki/Cryptographic_hash_function – jasonh Apr 22 '10 at 06:19
I suppose i should rephrase it as, a good hash algorithm will convert as many inputs to disparate outputs as possible. – RCIX Apr 22 '10 at 06:21

score 4 · Answer 6 · answered Apr 21 '10 at 14:17

Different inputs may result in the same hash; this is called a hash collision.

Check here:

http://en.wikipedia.org/wiki/Collision_%28computer_science%29

Hash collisions may be used to increase the chances of a successful brute-force attack; see:

http://en.wikipedia.org/wiki/Birthday_attack
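
To make the birthday effect concrete, here is a toy sketch that truncates MD5 to 24 bits (an assumption chosen so the search finishes quickly) and hashes sequential strings until two of them collide. Note that it finds *some* colliding pair; it does not find an input matching one specific, given hash.

```python
import hashlib
from itertools import count

def truncated_md5(text: str, hex_chars: int = 6) -> str:
    """First hex_chars hex digits of MD5 (6 digits = 24 bits): a deliberately tiny hash."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_chars]

def find_collision(hex_chars: int = 6):
    """Hash 'password0', 'password1', ... until two inputs share a truncated hash."""
    seen = {}
    for i in count():
        candidate = f"password{i}"
        digest = truncated_md5(candidate, hex_chars)
        if digest in seen:
            return seen[digest], candidate, digest
        seen[digest] = candidate

first, second, shared = find_collision()
print(first, second, shared)
```

With an n-bit hash a collision is expected after roughly 2^(n/2) attempts, so 24 bits fall after a few thousand tries. The same generic search against the full 128-bit MD5 output would need on the order of 2^64 hashes.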

score 4 · Answer 7 · answered Apr 21 '10 at 14:20

To expand on what others have said, a quick test shows that you get vastly different hashes with small changes made to the input.

I used the following code to run a quick test:

<?php
// Hash five passwords that differ only in their final digit.
for ($i = 0; $i < 5; $i++) {
    $password = 'password' . $i;
    echo $password . ' - ' . md5($password) . "<br />\n";
}
?>

and I got the following results:

password0 - 305e4f55ce823e111a46a9d500bcb86c
password1 - 7c6a180b36896a0a8c02787eeafb0e4c
password2 - 6cb75f652a9b52798eb6cf2201057c73
password3 - 819b0643d6b89dc9b579fdfc9094f28e
password4 - 34cc93ece0ba9e3f6f235d4af979b16c
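
A slightly more quantitative version of the same check, sketched in Python rather than PHP, counts how many of MD5's 128 output bits differ between neighbouring passwords. The helper names are made up for the example. Like eyeballing the hex strings, this only illustrates the avalanche effect; it cannot prove that no exploitable correlation exists.

```python
import hashlib

def md5_bits(text: str) -> int:
    """MD5 digest of text as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Number of bit positions in which the two MD5 digests differ."""
    return bin(md5_bits(a) ^ md5_bits(b)).count("1")

for i in range(4):
    a, b = f"password{i}", f"password{i + 1}"
    print(a, b, differing_bits(a, b), "of 128 bits differ")
```

For a hash that behaves like a random function, each pair should differ in about half of the bits, that is, somewhere near 64.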

UnkwnTech

To me they look different but ABC looks different from ZYX yet the algorithm to convert them is very simple. – NibblyPig Apr 21 '10 at 15:31
@Chad what can I say, its a programming site, wanna' know an answer, and you CAN test for it. DO SO! – UnkwnTech Apr 21 '10 at 16:46
The reason this is wrong, is that glancing at a few outputs by eye and failing to spot any correlation yourself, is absolutely not a valid check of whether there exists some correlation among them, that an attacker could use to improve a cryptanalytic attack. All that means is that an attacker with your knowledge, who spends 10 seconds on the problem, has no leverage. This is not the attack scenario that the questioner is concerned about, and you cannot rule out the existence of a related-plaintext attack with this "test". – Steve Jessop Sep 09 '13 at 07:44

score 1 · Answer 8 · answered Apr 21 '10 at 14:11

Short answer, no!

The output of a hash function varies greatly even if only one character is incremented.

But that only matters if the attacker is trying to break the hash function itself.

Incrementing your password is still bad practice, of course, since it makes brute-forcing easier.

Henri

score 1 · Answer 9 · answered Apr 21 '10 at 14:12

No, if you change the password even slightly, it produces a completely new hash.

Vonder

score 1 · Answer 10 · answered Apr 21 '10 at 14:13

As a general rule, a "good hash" will not hash two similar (but unequal) strings to similar hashes. MD5 is good enough that this isn't a problem. However, there are "rainbow tables" (essentially password:hash pairs) for quite a few common passwords, and for some password hashes (the traditional DES-based Unix passwords, for example) full rainbow tables exist.
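
The "password:hash pairs" idea can be sketched as a plain precomputed lookup, assuming unsalted MD5 and a tiny, made-up word list. A real rainbow table is more elaborate: it stores chains of hashes and reduction steps to trade lookup time for storage space. The effect on a common, unsalted password is the same, though.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical word list; real tables cover millions of candidates.
common_passwords = ["123456", "password", "letmein", "qwerty", "password1"]

# Built once, in advance, then reused against every stolen hash.
lookup = {md5_hex(p): p for p in common_passwords}

leaked_hash = "7c6a180b36896a0a8c02787eeafb0e4c"   # the MD5 of "password1"
print(lookup.get(leaked_hash))
```

Reversing the hash is now a single dictionary lookup. A per-user salt defeats this, because the table would have to be rebuilt for every salt value.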
