One hard working day I noticed that the GUIDs I had been generating with .NET's usual Guid.NewGuid() method all had the same digit 4 at the beginning of the third block:

efeafa5f-fe21-4ab4-ba82-b9eefd5fa225
480b64d0-6762-4afe-8496-ac7cf3292898
397579c2-a4f4-4611-9fda-16e9c1e52d6a
...

Ten of them appeared on the screen, about one per second. I started watching for the pattern right after the fifth GUID. Finally, the last one had the same four bits inside, and I decided that I was a lucky guy. I went home feeling that the whole world was open to such an exceptional person as me. The next week I found a new job, cleaned my room and called my parents.

But today I faced the same pattern again. A thousand times. And I don't feel like the Chosen One anymore.

I googled it, and now I know about UUIDs and the canonical format, with 4 bits reserved for the version and 2 for the variant.

Here's a snippet to experiment with:

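using System;

class Program
{
    static void Main(string[] args)
    {
        while (true)
        {
            var g = Guid.NewGuid();
            // Raw bytes: the first three groups are stored little-endian,
            // so their byte order differs from the string form below.
            Console.WriteLine(BitConverter.ToString(g.ToByteArray()));
            Console.WriteLine(g.ToString());
            // Press Enter to generate the next GUID.
            Console.ReadLine();
        }
    }
}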

But there is still one thing I don't understand (except how to go on living). Why do we need these reserved bits? I see how they can do harm: exposing internal implementation details, more collisions (still nothing to worry about, but one day...), more suicides. But I don't see any benefit. Can you help me find one?

Inside GUID generation algorithm

astef

1 Answer

The reserved bits exist so that if the generation algorithm is ever updated, that number can change. Otherwise, two different algorithms could produce the exact same UUID for different reasons, leading to a collision. It is a version identifier.

For example, consider a contrived simplistic UUID format:

00000000-00000000
  time  -   ip

now suppose we change that format for some reason to:

00000000-00000000
   ip   -  time

This could generate a collision when a machine with IP 12.34.56.78 generates a UUID using the first method at time 01234567, and later a second machine with IP 01.23.45.67 generates a UUID at time 12345678 using the newer method. But if we reserve some bits for a version identifier, this cannot possibly cause a collision.
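To make the contrived example concrete, here is a minimal sketch of both layouts. The `ContrivedUuid` class and its method names are hypothetical and exist only for illustration; the "tagged" variants assume we sacrifice the first digit for a version number, much as real UUIDs give up a few bits.

```csharp
using System;

static class ContrivedUuid
{
    // Hypothetical format 1: time block first, then IP block.
    public static string V1(string time, string ip) => $"{time}-{ip}";

    // Hypothetical format 2: IP block first, then time block.
    public static string V2(string time, string ip) => $"{ip}-{time}";

    // Same layouts, but the first digit is overwritten with a version number.
    public static string V1Tagged(string time, string ip) => "1" + V1(time, ip).Substring(1);
    public static string V2Tagged(string time, string ip) => "2" + V2(time, ip).Substring(1);
}

class Program
{
    static void Main()
    {
        // Machine A: old format, time 01234567, IP 12.34.56.78.
        string a = ContrivedUuid.V1(time: "01234567", ip: "12345678");
        // Machine B: new format, time 12345678, IP 01.23.45.67.
        string b = ContrivedUuid.V2(time: "12345678", ip: "01234567");
        Console.WriteLine(a == b); // True: the two formats collide

        string c = ContrivedUuid.V1Tagged(time: "01234567", ip: "12345678");
        string d = ContrivedUuid.V2Tagged(time: "12345678", ip: "01234567");
        Console.WriteLine(c == d); // False: the version digit keeps them apart
    }
}
```

The tagged IDs carry one digit less of time or IP data, which is the same trade-off the question complains about: a little entropy is given up so that IDs from different schemes can never be equal.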

The value 4 specifically refers to a randomly generated UUID (it therefore relies on the minuscule chance of collisions given so many bits), as opposed to other methods, which could use combinations of the time, MAC address, PID, or other sorts of time and space identifiers to guarantee uniqueness.
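If you want to see the reserved bits for yourself, the following sketch reads them from the canonical string form. The `GuidBits` helper is hypothetical; it assumes the RFC 4122 layout, where the version is the first hex digit of the third group and the variant occupies the top two bits of the first hex digit of the fourth group.

```csharp
using System;

static class GuidBits
{
    // Canonical "D" format: xxxxxxxx-xxxx-Vxxx-Nxxx-xxxxxxxxxxxx
    // V (index 14) holds the 4 version bits.
    public static int Version(Guid g)
    {
        string s = g.ToString("D");
        return Convert.ToInt32(s.Substring(14, 1), 16);
    }

    // N (index 19): its top two bits are the variant (binary 10 for RFC 4122).
    public static int VariantTopTwoBits(Guid g)
    {
        string s = g.ToString("D");
        int nibble = Convert.ToInt32(s.Substring(19, 1), 16);
        return nibble >> 2;
    }
}

class Program
{
    static void Main()
    {
        // The three GUIDs quoted in the question.
        string[] samples =
        {
            "efeafa5f-fe21-4ab4-ba82-b9eefd5fa225",
            "480b64d0-6762-4afe-8496-ac7cf3292898",
            "397579c2-a4f4-4611-9fda-16e9c1e52d6a",
        };

        foreach (string text in samples)
        {
            Guid g = Guid.Parse(text);
            // Each line prints "version=4 variant=2" (2 is binary 10).
            Console.WriteLine($"version={GuidBits.Version(g)} variant={GuidBits.VariantTopTwoBits(g)}");
        }
    }
}
```

This also explains why the first digit of the fourth block in those samples is always one of 8, 9, a or b: only the lower two bits of that digit are random.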

See here for the relevant spec: https://www.rfc-editor.org/rfc/rfc4122#section-4.1.3

Dave
  • Most developers don't think about encoding the time or IP in their GUIDs. We just need a unique identifier for a session or something else. But the default GUID generator implementations force us to remember these features. Isn't that unfair? – astef Jan 10 '15 at 17:20
  • If you care enough to worry about the 4 bits you're losing to this, then you should care enough to be using a UUID which doesn't depend on random chance for uniqueness, and if you're using a UUID like that, then you need a UUID which has a version identifier. So, it doesn't seem unfair to me at all! Besides, if in the future you decide to switch to a guaranteed-unique method, this version ID means you won't need to worry about collisions with your legacy IDs. – Dave Jan 10 '15 at 17:22
  • Of course I care about GUID uniqueness. `.NET`'s generator (`Guid.NewGuid()`) depends on random chance for uniqueness (except for those _6_ bits). So I don't really need this version identifier – astef Jan 10 '15 at 17:35
  • But the point is that someday you *might want to change your method*. Since right now you don't need that, everything's fine. But if/when you want to switch it, you'll have the benefit of not needing to worry about collisions because of your old values. By fixing the values (even in the random version), the spec guarantees that the upgrade path is available (even for people who don't consider that they might someday upgrade). If you're 100% sure you don't need it and never will, you could easily make your own generator. But I'd advise against it. – Dave Jan 10 '15 at 17:45