Questions tagged [hash-collision]

A situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest.

See also the wiki tag.

233 questions
14 votes, 2 answers

Chance of a duplicate hash when using first 8 characters of SHA1

If I have an index of URLs, and ID them by the first 8 characters of a SHA1 hash, what is the probability of two different URLs having identical IDs?
zino
  • 1,222
  • 2
  • 17
  • 47
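
A rough way to estimate this is the standard birthday approximation. The sketch below assumes the truncated hash behaves like a uniform random value; 8 hex characters give 32 bits, and the function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2N)),
    where N = 2**bits and hashes are assumed uniformly distributed.
    """
    space = 2 ** bits
    exponent = n_items * (n_items - 1) / (2 * space)
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # 8 hex characters of SHA1 = 32 bits.
    for n in (1_000, 10_000, 77_163, 1_000_000):
        print(n, collision_probability(n, 32))
```

Under that assumption, the probability reaches about 50% at roughly 77,000 URLs, so an 8-character prefix only suits fairly small indexes.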
13 votes, 5 answers

Are hash collisions with different file sizes just as likely as same file size?

I'm hashing a large number of files, and to avoid hash collisions, I'm also storing a file's original size - that way, even if there's a hash collision, it's extremely unlikely that the file sizes will also be identical. Is this sound (a hash…
SqlRyan
  • 33,116
  • 33
  • 114
  • 199
11 votes, 1 answer

How was the hash collision issue in ASP.NET fixed (MS11-100)?

As reported by Slashdot, MS issued an update to ASP.NET to fix the hash collision attack today. (Listed as "Collisions in HashTable May Cause DoS Vulnerability - CVE-2011-3414" on the linked Technet page.) The problem is that the POST data are…
svick
  • 236,525
  • 50
  • 385
  • 514
11 votes, 1 answer

How unique are the first 8-12 characters of SHA256 hashes?

Take this hash for example: ba7816bf 8f01cfea 414140de 5dae2223 b00361a3 96177a9c b410ff61 f20015ad. It's too long for my purposes, so I intend to use a small chunk from it, such as ba7816bf8f01 or ba7816bf, or similar. My intended use case: Video…
11 votes, 1 answer

Horrific collisions of adler32 hash

When using adler32() as a hash function, one should expect rare collisions. We can do the exact math of the collision probability, but roughly speaking, since it is a 32-bit hash function, there should not be many collisions on a sample set of a few…
Paul Oyster
  • 1,133
  • 1
  • 12
  • 21
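
One way to see why Adler-32 can collide far more often than a uniform 32-bit hash would on short inputs is to count distinct values over a small input set. This is only an illustrative sketch using Python's `zlib.adler32`; the choice of three-letter lowercase strings is an assumption and not taken from the question.

```python
import zlib
from itertools import product
from string import ascii_lowercase

# All 26**3 three-letter lowercase strings, as bytes.
words = ["".join(p).encode() for p in product(ascii_lowercase, repeat=3)]

# Adler-32 is built from two small running sums, so short inputs
# only reach a narrow slice of the 32-bit output range.
distinct = len({zlib.adler32(w) for w in words})

print("inputs:", len(words))
print("distinct adler32 values:", distinct)

# A concrete colliding pair: both strings have the same byte sum
# and the same position-weighted sum.
print(zlib.adler32(b"bcb") == zlib.adler32(b"cac"))
```

The pair `bcb` / `cac` collides because Adler-32 of a three-byte input depends only on the byte sum and on a position-weighted sum, and both agree for these two strings.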
10 votes, 5 answers

CHECKSUM() collisions in SQL Server 2005

I've got a table of 5,651,744 rows, with a primary key made of 6 columns (int x 3, smallint, varchar(39), varchar(2)). I am looking to improve the performance with this table and another table which shares this primary key plus an additional column…
Cade Roux
  • 88,164
  • 40
  • 182
  • 265
9 votes, 6 answers

md5 hash collisions.

If counting from 1 to X, where X is the first number to have an md5 collision with a previous number, what number is X? I want to know if I'm using md5 for serial numbers, how many units I can expect to be able to enumerate before I get a collision.
John Lewis
  • 712
  • 7
  • 15
8 votes, 4 answers

Looking for a good 64 bit hash for file paths in UTF16

I have a Unicode / UTF-16 encoded path. The path delimiter is U+005C '\'. The paths are null-terminated, root-relative Windows file system paths, e.g. "\windows\system32\drivers\myDriver32.sys". I want to hash this path into a 64-bit unsigned…
Dominik Weber
  • 711
  • 5
  • 13
8 votes, 2 answers

Understanding cyclic polynomial hash collisions

I have code that uses a cyclic polynomial rolling hash (Buzhash) to compute hash values of n-grams of source code. If I use small hash values (7-8 bits), then there are some collisions, i.e. different n-grams map to the same hash value. If I…
csprajeeth
  • 237
  • 2
  • 10
8 votes, 5 answers

How to handle a dict variable with 2^50 elements?

I have to find SHA256 hashes of 2^25 random strings and then look for collisions (using the birthday paradox on, say, the last 50 bits of the hash only). I am storing the string:hash pairs in a dict variable, then sorting the variable by value (not…
ritratt
  • 1,703
  • 4
  • 25
  • 45
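
A minimal sketch of the truncated-hash collision search described here, keyed on the truncated value so that no sorting is needed. The names `truncated_hash` and `find_collision` are hypothetical, counter strings stand in for the random strings, and 16 bits are used instead of 50 so the demo finishes quickly.

```python
import hashlib
from typing import Optional, Tuple


def truncated_hash(data: bytes, bits: int) -> int:
    """Return the lowest `bits` bits of the SHA-256 digest as an int."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full & ((1 << bits) - 1)


def find_collision(bits: int, max_tries: int) -> Optional[Tuple[bytes, bytes]]:
    """Search for two distinct messages with the same truncated hash.

    Messages are the decimal strings "0", "1", ... (an assumption made
    for reproducibility; the question uses random strings).
    """
    seen = {}  # truncated hash -> first message that produced it
    for i in range(max_tries):
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
    return None


if __name__ == "__main__":
    # With 16 bits there are only 65,536 possible values, so 70,000
    # tries must produce a collision by the pigeonhole principle.
    print(find_collision(16, 70_000))
```

Keying the dict on the truncated hash makes each lookup a constant-time check instead of a sort over all pairs, though memory still grows with the number of stored entries.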
7 votes, 3 answers

Recursive MD5 and probability of collision

I wonder if it is 'safe' to hash a bunch of MD5 hash values together to create a new hash or whether this will in any way increase the probability of collisions. The background: I have a couple of files with dependencies. Each file has an associated…
Janick Bernet
  • 20,544
  • 2
  • 29
  • 55
7 votes, 3 answers

What is the maximum number of SHA-1 hashes?

Clearly since SHA-1 hashing produces 40 characters each time, there is a finite number of possible hashes—does anyone know exactly how many?
James
  • 30,496
  • 19
  • 86
  • 113
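
The count of possible output values follows directly from the digest length, as this short sketch shows. It counts possible 160-bit values only; it does not show that every value is actually produced by some input.

```python
import hashlib

# SHA-1 digests are 160 bits, i.e. 40 hexadecimal characters.
digest = hashlib.sha1(b"example").hexdigest()
hex_chars = len(digest)

# Each hex character has 16 possible values.
total = 16 ** hex_chars

print(hex_chars)            # number of hex characters in a SHA-1 digest
print(total == 2 ** 160)    # same quantity expressed in bits
print(total)
```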
7 votes, 6 answers

Purposely create two files to have the same hash?

If someone is purposely trying to modify two files to have the same hash, what are ways to stop them? Can md5 and sha1 prevent the majority case? I was thinking of writing my own, and I figure that even if I don't do a good job, if the user doesn't know my…
user34537
7 votes, 1 answer

How to identify whether or not std::unordered_map has experienced hash collisions?

How to identify whether or not the keys in a std::unordered_map have experienced hash collisions? That is, how to identify if any collision chaining is present?
Greg
  • 8,175
  • 16
  • 72
  • 125
7 votes, 3 answers

Open Addressing vs. Separate Chaining

Which hashmap collision handling scheme is better when the load factor is close to 1 to ensure minimum memory wastage? I personally think the answer is open addressing with linear probing, because it doesn't need any additional storage space in case…
user191776