19

Just curious but what is the probability of matching a Guid?

Say a Guid from SQL server: 5AC7E650-CFC3-4534-803C-E7E5BBE29B3D

is it a factorial?: (36*32)! = (1152)!

discuss =D

RhinoDevX64
  • 687
  • 1
  • 5
  • 13
  • 2
    Let's think that one through... if there were only two characters in the GUID, would it be (2*36)! ? 36*36 sounds more likely... Work it out for three chars, and then see what answer looks like it will make sense. – dave jones Feb 02 '11 at 18:38
  • Why would you think it's a factorial. That would only be the case if you couldn't repeat values. – Babak Naffas Feb 02 '11 at 18:42
  • I would bet that only one of those fields (e.g. E7E5BBE29B3D) is random. Others are fixed (e.g. by host or server instance) or based on current time. This seriously reduces the possibilities. – Arnaud Le Blanc Feb 02 '11 at 18:42
  • I was thinking 26 letters plus 10 possible numbers makes 36 possible values for a single position in the GUID excluding the dashes. not sure why i was thinking factorial, sketchy memory maybe!! =\ – RhinoDevX64 Feb 02 '11 at 18:43

6 Answers6

32

It's not clear what you're asking. I see two ways to interpret your question.

  1. Given a GUID g, what is the probability of someone guessing it? Let's assume for simplicity that all 128 bits of a GUID are available. Then the probability of guessing g is 2^-128. That's small. Let's get some intuition around that. Let's assume that our attacker can generate one billion GUIDs per second. To have a 50% chance of guessing g, our attacker would have to generate 2^127 GUIDs. At a rate of one billion per second, it would take 5391448762278159040348 years to generate 2^127 GUIDs.

  2. We are generating a collection of guids. What is the likelihood of a collision? That is, what is the likelihood that we generate two guids with the same value? This is the birthday paradox. If you generate a sequence of n GUIDs randomly, then the probability of at least one collision is approximately p(n) = 1 - exp(-n^2 / 2 * 2^128) (this is the birthday problem with the number of possible birthdays being 2^128).

n p(n) 2^30 1.69e-21 2^40 1.77e-15 2^50 1.86e-10 2^60 1.95e-03

So, even if you generate 2^60 GUIDs, the odds of a collision are extremely small. If you can generate one billion GUIDs per second, it would still take 36 years to have a 1.95e-03 chance of a collision.

Stu
  • 15,675
  • 4
  • 43
  • 74
jason
  • 236,483
  • 35
  • 423
  • 525
  • Is 2^64 correct? Half of 2^128 is 2^127. Statistics isn't my strong point, so maybe it only takes sqrt(n) to reach a 50% threshold. – Jimothy Nov 04 '14 at 19:55
6

The number of possible GUIDs (128-bit value) is 2^128 or 3.4×10^38 — roughly 2 trillion per cubic millimeter of the entire volume of the Earth.

In other words, kind of low.

(Source for Earth volume reference: WikiPedia)

Stu
  • 15,675
  • 4
  • 43
  • 74
3

Depends on the type of GUID generation algorithm. Current algorithms use 124 random bits so the probability is 1 in 2^124.

With older algorithms (that use time and MAC address) the probability is much higher.

John
  • 5,561
  • 1
  • 23
  • 39
3

There are a number of things wrong with your calculations. First off, 36*32 implies that any alphanumeric character can be part of the GUID. This is not the case; only HEX characters are allowed.

Secondly, the calculation for the number of unique GUIDs is Number of Valid Characters (16: 0-9, A-F) ^ length of GUID (32 characters )

So we have 16^32 ==> 2^(4^32) ==> 2^128

The odds of guessing any one GUID is 1 / 2^128.

Babak Naffas
  • 12,395
  • 3
  • 34
  • 49
  • This assumes that each single byte of the GUID is truly random. To ensure that GUIDs are unique among hosts, most parts of a UUID are actually fixed (e.g. a MAC address). Then the rest is padded with current time and a few bytes are random. And even those random bytes are guessable when you known some of the previously generated UUIDs.) Say 48 bits of MAC address + 64 of microtime. 128-48-64=16. I would say that the odds are closer to 2^16 than to 2^128. – Arnaud Le Blanc Feb 03 '11 at 12:52
1

It depends on how many GUIDs are generated. This is similar to the birthday problem in statistics. See Wikipedia and http://betterexplained.com/articles/understanding-the-birthday-paradox/ (this one specifically has a GUID example)

In general, the probability of a collision for generating M guids over N possible guids is 1 - (1- (1/N))^C(M,2) where C(M,2) is 'M choose 2'

dfb
  • 13,133
  • 2
  • 31
  • 52
0

It is 1 / (number of unique numbers possible with the given UID length). In the above example we see 16 bytes, or 128 bits. 2^128, so the probability of a match is 1 / 2^128.

Nathan Kidd
  • 2,919
  • 21
  • 22