
I understand that this doesn't take a significant chunk off of the entropy involved, and that even if a whole nother character of the GUID was reserved (for any purpose), we still would have more than enough for every insect to have one, so I'm not worried, just curious.

As this great answer shows, the Version 4 algorithm for generating GUIDs has the following format:

xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
  1. x is random
  2. 4 is constant, this represents the version number.
  3. y is one of: 8, 9, A, or B
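
The pattern above can be observed directly. The following is only a sketch, assuming Python's standard `uuid` module as the generator; the sample size of 1000 is an arbitrary choice, and the variable names are illustrative.

```python
import uuid

# Generate a batch of version-4 UUIDs in their canonical text form.
samples = [str(uuid.uuid4()) for _ in range(1000)]

# Hyphens sit at string indices 8, 13, 18 and 23, so:
#   index 14 is the 13th hex digit (the constant "4"),
#   index 19 is the 17th hex digit (the "y" in the pattern).
version_digits = {s[14] for s in samples}
y_digits = {s[19] for s in samples}

print(version_digits)
print(sorted(y_digits))
```

Python prints the hex digits in lowercase, so the second set is drawn from 8, 9, a and b.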

The RFC spec for UUIDs says that these bits must be set this way, but I don't see any reason given.

Why is the third bullet (the 17th digit) limited to only those four digits?

NH.

3 Answers


Bits, not hex

Focusing on hexadecimal digits is confusing you.

A UUID is not made of hex. A UUID is made of 128 bits.

Humans would resent reading 128 bits presented as a long string of 1 and 0 characters. So for the benefit of reading and writing by humans, we present the 128 bits in hex.

Always keep in mind that when you see the series of 36 characters (32 hex digits plus four hyphens), you are not looking at a UUID. You are looking at some text generated to represent the 128 bits that are actually in the UUID.
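
To make the distinction concrete, here is a small sketch using Python's standard `uuid` module. The example value is the version-4 UUID that appears in the C# snippet elsewhere in this thread; the variable names are illustrative.

```python
import uuid

text = "7c2a81c7-37ce-4bae-ba7d-11123200d59a"
u = uuid.UUID(text)

as_int = u.int                      # the value as one 128-bit integer
as_bytes = u.bytes                  # the same value as 16 octets
as_bits = format(as_int, "0128b")   # 128 characters of '0' and '1'

print(len(text), len(as_bytes), len(as_bits))

# Different text forms parse to the same underlying 128-bit value.
same = uuid.UUID("7C2A81C737CE4BAEBA7D11123200D59A")
braced = uuid.UUID("{7c2a81c7-37ce-4bae-ba7d-11123200d59a}")
print(u == same == braced)
```

The hyphenated string, the uppercase string without hyphens, and the braced string are three pieces of text, but only one UUID.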

Version & Variant

The first special meaning you mention, the “version” of UUID, is recorded using 4 bits. See section 4.1.3 of your linked spec.

The second special meaning you indicate is the “variant”. This value takes 1-3 bits. See section 4.1.1 of your linked spec.
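
As a rough illustration of where those two fields live, the sketch below pulls them out of the 16 octets with shifts. It assumes Python's standard `uuid` module, whose `bytes` attribute returns the octets in the RFC's big-endian field order; the example value is arbitrary.

```python
import uuid

u = uuid.UUID("7c2a81c7-37ce-4bae-ba7d-11123200d59a")
octets = u.bytes  # 16 octets, big-endian field order

# Version: the four most significant bits of octet 6 (counting from 0).
version = octets[6] >> 4

# Variant: the most significant bits of octet 8.
# The RFC 4122 variant uses only the top two bits, which must be '10'.
variant_top_two = octets[8] >> 6

print(version, format(variant_top_two, "02b"))

# The standard library exposes the same information directly.
print(u.version, u.variant == uuid.RFC_4122)
```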

A hex character represents 4 bits (half an octet).

  • The Version number, being 4 bits, takes an entire hex character to itself.
  • Version 4 specifically uses the bits 01 00, which is 4 in hex, the same as in decimal (base 10).
  • The Variant, being 1-3 bits, does not take an entire hex character.
  • Outside the Microsoft world of GUIDs, the rest of the industry nowadays uses two bits, 10 (decimal value 2), as the variant. This pair of bits lands in the most significant bits of octet 8 (counting from zero). That octet looks like this, where ‘n’ means 0 or 1: 10 nn nn nn. Each half of that octet is represented by one hex character. So your 17th hex digit, the first half of that octet, 10 nn, can only have four possible values:
    • 10 00 (hex 8)
    • 10 01 (hex 9)
    • 10 10 (hex A)
    • 10 11 (hex B)
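
The four values can also be derived mechanically. This is a minimal sketch in plain Python: it fixes the top two bits of a 4-bit value to 10, then, for comparison, groups all sixteen hex digits by the variant prefixes listed in section 4.1.1 of the spec. The names `possible` and `by_prefix` are illustrative.

```python
# Fix the top two bits of a nibble to '10' and let the low two bits vary.
possible = [format(0b1000 | low_bits, "X") for low_bits in range(4)]
print(possible)  # ['8', '9', 'A', 'B']

# Group every possible hex digit by the variant prefix its bits start with.
by_prefix = {}
for nibble in range(16):
    bits = format(nibble, "04b")
    if bits.startswith("0"):
        prefix = "0xx"   # reserved, NCS backward compatibility
    elif bits.startswith("10"):
        prefix = "10x"   # the RFC 4122 variant
    elif bits.startswith("110"):
        prefix = "110"   # reserved, Microsoft backward compatibility
    else:
        prefix = "111"   # reserved for future definition
    by_prefix.setdefault(prefix, []).append(format(nibble, "X"))

print(by_prefix)
```

Only the "10x" group matters for the UUIDs discussed here; the other groups show which leading digits would signal a different variant.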
Basil Bourque

Quoting the estimable Mr. Lippert

First off, what bits are we talking about when we say “the bits”? We already know that in a “random” GUID the first hex digit of the third section is always 4....there is additional version information stored in the GUID in the bits in the fourth section as well; you’ll note that a GUID almost always has 8, 9, a or b as the first hex digit of the fourth section. So in total we have six bits reserved for version information, leaving 122 bits that can be chosen at random.

(from https://ericlippert.com/2012/05/07/guid-guide-part-three/)

tl;dr - it's more version information. To get more specific than that I suspect you're going to have to track down the author of the spec.
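
The "six reserved bits, 122 random bits" count from the quote can be checked empirically. The sketch below assumes Python's standard `uuid.uuid4()` as the generator and an arbitrary sample size of 512; it finds the bit positions that never change across the samples.

```python
import uuid

samples = [uuid.uuid4().int for _ in range(512)]

all_ones = (1 << 128) - 1
always_one = all_ones    # bits that are 1 in every sample so far
always_zero = all_ones   # bits that are 0 in every sample so far
for value in samples:
    always_one &= value
    always_zero &= ~value & all_ones

# A bit is "fixed" if it was always 1 or always 0.
fixed_mask = always_one | always_zero
fixed_bits = bin(fixed_mask).count("1")

print(fixed_bits, 128 - fixed_bits)
```

With this many samples, the chance of a genuinely random bit looking constant is negligible, so the fixed positions are the four version bits and the two variant bits.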

Mike G

Based on what I learned today, I put together a C#/.NET LINQPad script that breaks a GUID/UUID down into its variant and version, in case it helps:

void Main()
{
    var guid =
        Guid.Parse(
                //@"08c8fbdc-ff38-402e-b0fd-353392a407af"  // v4 - Microsoft/.NET
                @"7c2a81c7-37ce-4bae-ba7d-11123200d59a"  // v4
                //@"493f6528-d76a-11ec-9d64-0242ac120002"  // v1
                //@"5f0ad0df-99d4-5b63-a267-f0f32cf4c2a2"  // v5
            );

    Console.WriteLine(
        $"UUID = '{guid}' :");

    Console.WriteLine();

    var guidBytes =
        guid.ToByteArray();

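    // Version # - high nibble of time_hi_and_version.
    // Guid.ToByteArray() returns the first three fields in little-endian order,
    // so the octet holding the version is at index 7 here (index 6 in RFC 4122 order).

    const int timeHiAndVersionOctetIdx = 7;

    var timeHiAndVersionOctet =
        guidBytes[timeHiAndVersionOctetIdx];

    var versionNum =
        (timeHiAndVersionOctet & 0b11110000) >> 4;  // 0xF0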

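    // Variant # - top bits of clock_seq_hi_and_reserved (index 8 in either byte order)

    const int clkSeqHiResOctetIdx = 8;

    var clkSeqHiResOctet =
        guidBytes[clkSeqHiResOctetIdx];

    // Keep the top 3 bits so that every variant pattern
    // ('0 x x', '1 0 x', '1 1 0', '1 1 1') can be told apart below.
    // For the RFC 4122 variant ('1 0 x') the 17th hex digit is 0x8/0x9/0xA/0xB.
    var variantNum =
        (clkSeqHiResOctet & 0b11100000) >> 5;  // 0xE0/3 bits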

    //

    Console.WriteLine(
        $"\tVariant # = '{variantNum}' ('0x{variantNum:X}') - '0b{((variantNum & 0b00000100) > 0 ? '1' : '0')}{((variantNum & 0b00000010) > 0 ? '1' : '0')}{((variantNum & 0b00000001) > 0 ? '1' : '0')}'");
    Console.WriteLine();

    if (variantNum < 4)
    {
        Console.WriteLine(
            $"\t\t'0 x x' - \"Reserved, NCS backward compatibility\"");
    }
    else if (variantNum == 4 ||
             variantNum == 5)
    {
        Console.WriteLine(
            $"\t\t'1 0 x' - \"The variant specified in this {{RFC4122}} document\"");
    }
    else if (variantNum == 6)
    {
        Console.WriteLine(
            $"\t\t'1 1 0' - \"Reserved, Microsoft Corporation backward compatibility\"");
    }
    else if (variantNum == 7)
    {
        Console.WriteLine(
            $"\t\t'1 1 1' - \"Reserved for future definition\"");
    }

    Console.WriteLine();

    Console.WriteLine(
        $"\tVersion # = '{versionNum}' ('0x{versionNum:X}') - '0b{((versionNum & 0b00001000) > 0 ? '1' : '0')}{((versionNum & 0b00000100) > 0 ? '1' : '0')}{((versionNum & 0b00000010) > 0 ? '1' : '0')}{((versionNum & 0b00000001) > 0 ? '1' : '0')}'");
    Console.WriteLine();

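    // Plain (non-interpolated) strings: a single pair of braces prints as '{RFC4122}',
    // matching the output of the interpolated strings above.
    string[] versionDescriptions =
        new[]
        {
            @"The time-based version specified in this {RFC4122} document",
            @"DCE Security version, with embedded POSIX UIDs",
            @"The name-based version specified in this {RFC4122} document that uses MD5 hashing",
            @"The randomly or pseudo-randomly generated version specified in this {RFC4122} document",
            @"The name-based version specified in this {RFC4122} document that uses SHA-1 hashing"
        };

    // Guard the lookup: version 0 or anything above 5 has no description in this table.
    var versionDescription =
        versionNum >= 1 && versionNum <= versionDescriptions.Length
            ? versionDescriptions[versionNum - 1]
            : "Version not described in RFC4122";

    Console.WriteLine(
        $"\t\t'{versionNum}' = \"{versionDescription}\"");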
    Console.WriteLine();

    Console.WriteLine(
        $"'RFC4122' document - <https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.1>");
    Console.WriteLine();
}
DennisVM-D2i