4

I want to check if a GUID structure is empty/all fields are 0. This is the code I wrote:

#include <windows.h>

static BOOL IsEmptyGuid(const GUID * const pGuid)
{
    return \
    (pGuid->Data1 == 0) &&
    (pGuid->Data2 == 0) &&
    (pGuid->Data3 == 0) &&
#ifdef _WIN64
    (*(DWORD64 *)pGuid->Data4 == 0);
#else
    (*(DWORD *)pGuid->Data4 == 0) && (*(DWORD *)(pGuid->Data4 + 4) == 0);
#endif
}

/* GUID definition from MSDN
typedef struct _GUID {
    DWORD Data1;
    WORD  Data2;
    WORD  Data3;
    BYTE  Data4[8];
} GUID;
*/


int main() {
    GUID guid1;
    guid1.Data1 = 0;
    guid1.Data2 = 0;
    guid1.Data3 = 0;
    memset(guid1.Data4, 0x0, 8);

    printf("Result: %u\n", IsEmptyGuid(&guid1));
}

A safer way to check if field Data4 equals 0 is to iterate over every byte and check for it's value. But, I find the code from above more expressive.

I would like to know, is it correct? Is it safe?

Thank you!

  • 1
    @RbMm Please don't give incorrect answers in comment section where they cannot be downvoted. – user694733 Jul 31 '17 at 08:52
  • @user694733 - why you think that this is incorrect or not safe ? absolute correct. and i think it too small to be answer – RbMm Jul 31 '17 at 08:53
  • aside: That backslash in `return \` achieves nothing. – underscore_d Jul 31 '17 at 09:06
  • 3
    Is there a reason why not to create an empty GUID with `GUID guid_empty = { 0, 0, 0, { 0, 0, 0, 0, 0, 0, 0, 0 } };` and then using built-in function `IsEqualGUID`? – vasek Jul 31 '17 at 09:12
  • 2
    No, I just didn't know it exists. I looked for a function that tells me if it's empty. That's a change of perspective, to look if it's equal. Nice, thank you! – thrillseeker Jul 31 '17 at 09:16
  • 4
    Just don't reinvent this wheel, use IsEqualGUID() to compare to GUID_NULL. – Hans Passant Jul 31 '17 at 09:25

2 Answers2

3

Code is incorrect. It breaks strict aliasing rule (N1570 §6.5 p7), causing undefined behaviour.

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

To be exact, UB happens when you dereference pointer using non-matching type:

DWORD64 * temp = (DWORD64 *)pGuid->Data4; // Allowed, but implementation defined
DWORD64 temp2 = *temp;                    // Undefined behaviour

Use loop to check each element individually, or compare with memcmp to zero filled array of the same size.


As noted in the comments, some compilers allow disabling strict aliasing, but that should be avoided as it makes code less portable, and you still have potential alignment issues.

Community
  • 1
  • 1
user694733
  • 15,208
  • 2
  • 42
  • 68
  • here nothing undefined. member layout (offsets) of all fields well known. only one is bad for `_WIN64` - that align of `Data4` is 4 byte, when checked 8 bytes – RbMm Jul 31 '17 at 09:00
  • I think this answer is too restrictive: MS has a long history of allowing type-punning, and they don't seem to plan on disallowing it in future. See also [here](https://stackoverflow.com/q/37176461/509868). – anatolyg Jul 31 '17 at 09:04
  • 1
    @RbMm That _is_ UB. It is defined to reinterpret other types through `[[un]signed] char*` pointers but not the other way around. (assuming `BYTE` is definitely a `char` type anyway) – underscore_d Jul 31 '17 at 09:04
  • 1
    Just keep saying that and it'll magically become true. Alternatively, cite a source that backs you up. – underscore_d Jul 31 '17 at 09:05
  • I'd say it is correct too. Data4 is aligned on 64 bit address when on 64 bit and 32 bit address when on 32 bit processor so it's perfectly safe to cast to 64bit pointer – Serve Laurijssen Jul 31 '17 at 09:07
  • 1
    This doesn't matter. This answer is correct, accessing a `char` array through a pointer of different type is *undefined behavior*. –  Jul 31 '17 at 09:10
  • 1
    It might well happen to work on MS's compiler, and maybe they're unlikely to change that. That doesn't make it "correct" or good code. There are simple ways to get fully defined behaviour here, particularly because it's C (not C++), so there's no reason to keep using brittle code. – underscore_d Jul 31 '17 at 09:10
  • @underscore_d - `It might well happen to work on MS's compiler, ` - it will be work for all compilers, otherwice code generated between this compilers become binary incomatible – RbMm Jul 31 '17 at 09:13
  • 1
    @RbMm You keep saying this with no substantiation whatsoever. It has been proven time and time again that compilers can use aliasing rules to optimise away invalid code like yours. That has nothing to do with your vague notion of 'binary compatibility', which everyone else manages to achieve without writing fundamentally broken code. – underscore_d Jul 31 '17 at 09:14
  • @underscore_d - the `GUID` struct used in many windows api as agrument - so any generated binary code (of any compiler) must interpret `GUID` in the **SAME** way as windows binaries – RbMm Jul 31 '17 at 09:16
  • 1
    @RbMm They must interpret it in the same way when told to interpret it using well-defined code. That has nothing to do with whether they should allow invalid aliasing. At this point, you've just got your fingers in your ears singing _la la la I can't hear you_. – underscore_d Jul 31 '17 at 09:17
  • 1
    @user694733 you might want to cite C11 §6.5, section 7 so this ridiculous discussion ends... –  Jul 31 '17 at 09:18
  • @FelixPalmen - if some compiler will be another interpret `GUID` fields and layout - it code became incompatible with windows – RbMm Jul 31 '17 at 09:18
-6

when _WIN64 not defined

(*(DWORD *)pGuid->Data4 == 0) && (*(DWORD *)(pGuid->Data4 + 4) == 0);

case - this is absolute safe and correct - the Data4 member is align to DWORD (4) and have size of 2 DWORD (2 * sizeof(DWORD) == 8)

for _WIN64 the align of Data4 still DWORD (4) because all structure align is 4 ( C_ASSERT(__alignof(GUID)==4) ) - when code (*(DWORD64 *)pGuid->Data4 == 0); assume 8 byte data align reference. for say x64 usually processor not generate exception when referenced to unaligned data. however on another platform this can be. in any case access to unaligned data bad for performance. so check must be:

BOOL IsEmptyGuid(const GUID * pGuid)
{
    return (pGuid->Data1 == 0) &&
        (pGuid->Data2 == 0) &&
        (pGuid->Data3 == 0) &&
        (*(DWORD *)pGuid->Data4 == 0) && (*(DWORD *)(pGuid->Data4 + 4) == 0);
}

for all platforms.

and about member offsets, align etc. it well known and here can not be any UB. otherwise will be impossible any binary interface between to piece of code


interesting, what is not correct here:

  1. Data4 not pointer to 8 continuous bytes ?
  2. Data4 have not 8 bytes offset from GUID ?
  3. Data4 not aligned on 4 byte (of course if pGuid itself properly aligned) ?
  4. we can not check 8 byte memory (aligned on DWORD) as 2 DWORD ?

and some code from windows headers files:

// Faster (but makes code fatter) inline version...use sparingly
#ifdef __cplusplus
__inline int InlineIsEqualGUID(REFGUID rguid1, REFGUID rguid2)
{
   return (
      ((unsigned long *) &rguid1)[0] == ((unsigned long *) &rguid2)[0] &&
      ((unsigned long *) &rguid1)[1] == ((unsigned long *) &rguid2)[1] &&
      ((unsigned long *) &rguid1)[2] == ((unsigned long *) &rguid2)[2] &&
      ((unsigned long *) &rguid1)[3] == ((unsigned long *) &rguid2)[3]);
}

__inline int IsEqualGUID(REFGUID rguid1, REFGUID rguid2)
{
    return !memcmp(&rguid1, &rguid2, sizeof(GUID));
}

#else   // ! __cplusplus

#define InlineIsEqualGUID(rguid1, rguid2)  \
        (((unsigned long *) rguid1)[0] == ((unsigned long *) rguid2)[0] &&   \
        ((unsigned long *) rguid1)[1] == ((unsigned long *) rguid2)[1] &&    \
        ((unsigned long *) rguid1)[2] == ((unsigned long *) rguid2)[2] &&    \
        ((unsigned long *) rguid1)[3] == ((unsigned long *) rguid2)[3])

#define IsEqualGUID(rguid1, rguid2) (!memcmp(rguid1, rguid2, sizeof(GUID)))

#endif  // __cplusplus

#ifndef __IID_ALIGNED__
    #define __IID_ALIGNED__
    #ifdef __cplusplus
        inline int IsEqualGUIDAligned(REFGUID guid1, REFGUID guid2)
        {
            return ((*(PLONGLONG)(&guid1) == *(PLONGLONG)(&guid2)) && (*((PLONGLONG)(&guid1) + 1) == *((PLONGLONG)(&guid2) + 1)));
        }
    #else // !__cplusplus
        #define IsEqualGUIDAligned(guid1, guid2) \
            ((*(PLONGLONG)(guid1) == *(PLONGLONG)(guid2)) && (*((PLONGLONG)(guid1) + 1) == *((PLONGLONG)(guid2) + 1)))
    #endif // !__cplusplus
#endif // !__IID_ALIGNED__
RbMm
  • 31,280
  • 3
  • 35
  • 56
  • 3
    "here can not be any UB". And yet, there is. Dereferencing a type-punned pointer breaks aliasing rules. The Standard specifically disallows this. When there are easy ways to do this in a defined way, why would you so enthusiastically encourage people to write brittle code? – underscore_d Jul 31 '17 at 09:12
  • @underscore_d - i have safe vision. and i good know that this is absolute correct. all members offset and layout is know. no any ub – RbMm Jul 31 '17 at 09:15
  • You keep pointing out unrelated reasons for which this code is OK while ignoring everyone's criticism of the fundamental reason why it is not. That seems awfully convenient to me. – underscore_d Jul 31 '17 at 09:15
  • 1
    its a pedantic discussion. from standard pov it's UB and therefore incorrect, but there is likely not a single compiler where it wouldnt work as the address are aligned correctly – Serve Laurijssen Jul 31 '17 at 09:22
  • 2
    @ServéLaurijssen I don't think it's pedantic to advise against writing code that contradicts the definition of the language and is not required to work on any compiler, especially when well-defined alternatives exist and are easy to write. – underscore_d Jul 31 '17 at 09:24
  • @ServéLaurijssen - if exist this comiler - code generated by it simply will be incompatible with windows (winapi). take for example any api which used guid. `StringFromCLSID` say - how can be binary interface if different compilers different interpret `guid` ?? – RbMm Jul 31 '17 at 09:26
  • 1
    @RbMm You don't seem to want to acknowledge this point and how it invalidates your attempt at argument, but here I go again: There is no risk of incompatibility if all the client code code is properly written, i.e. well-defined and Standard-compliant. Violating aliasing rules is not that. – underscore_d Jul 31 '17 at 09:27
  • @underscore_d - can be many alternatives for check guid data. but question was - are code correct. guid is only 16 bytes aligned to 4 byte. – RbMm Jul 31 '17 at 09:27
  • @RbMm and again, the code is **not** correct if its way of interpreting the GUID involves breaking aliasing rules. – underscore_d Jul 31 '17 at 09:28
  • @underscore_d - no any risk of incomtible. otherwise any binary windows interface, any windows api impossible. all *c* structs used in winapi must be equal interprets by all compilers – RbMm Jul 31 '17 at 09:29
  • 3
    @RbMm You're writing a lot of nonsense because you don't know how compilers make use of the strict aliasing rule: They are allowed to optimize code away based on the assumption that the code is written correctly. (the binary representation of any type has **nothing** to do with it) –  Jul 31 '17 at 09:30
  • @FelixPalmen - i know what i wrote. because i programming not only on c/c++ but on asm too and very good know how all internal worked – RbMm Jul 31 '17 at 09:31
  • 1
    @RbMm I don't doubt that, problem is that you don't know how C (in terms of the abstract machine) works. –  Jul 31 '17 at 09:32
  • @FelixPalmen - please explain (how you think) possible binary interface - when 2 different `PE` call function with `guid*` as argument ? sense in that all compiler must equal interpret it layout – RbMm Jul 31 '17 at 09:34
  • @RbMm Read it again: _they must interpret the layout the same, using well-defined code_. If they use broken code to interpret it, who cares what happens? So, simply accessing members, or `memcpy()`ing, or whatever, works fine. Breaking aliasing rules is not required to work. The same is true even if the object that gets aliased is an `int`, not a `struct`. – underscore_d Jul 31 '17 at 09:37
  • 1
    @RbMm what they are saying is that when a compiler encounters this code it's allowed by the standard to do anything whatsoever as it is UB and thats correct. – Serve Laurijssen Jul 31 '17 at 09:41
  • @underscore_d - Data4 this is continuous 8 byte data aligned to 4 byte. question was are code is correct (not are it best, most effective, most elegant, etc). formally it correct (for 32 bit case) and here no any UB - function always return correct answer. – RbMm Jul 31 '17 at 09:44
  • @ServéLaurijssen - from binary view here no any UB. function always return correct answer. `Data4` is only continuous 8 byte data aligned to 4 byte. nothing more – RbMm Jul 31 '17 at 09:46
  • 2
    I give up. You refuse to engage with the actual point and think repeating something else often enough will drown it out. Per the Standard, the code is UB. I don't care if it works on your compiler. It's not required to, and it's bad advice. The end. – underscore_d Jul 31 '17 at 09:50
  • @underscore_d - if this will be not work on another compiler - code generated by this compiler will be binary incompatible with windows. because this can not work only if some compiler another interpret `Data4` pointer – RbMm Jul 31 '17 at 09:52
  • @RbMm That's correct. Your code is allowed to break on another compiler because it's not Standard-compliant. So it's bad code and irresponsible. That doesn't mean that aliasing rules make _all_ binary-compatibility impossible: it just means your code is bad. I don't know how else to explain this that hasn't already been tried. – underscore_d Jul 31 '17 at 09:54
  • @underscore_d - look like you not low level programming, dont know assembler, memory, how all this work internally. memory is only a memory. here we have 8 bytes of memory aligned to dword. we can check it byte by byte. but can and as 2 dword – RbMm Jul 31 '17 at 10:24
  • 1
    @RbMm Your attempts at evasion are quite incredible - now trying to distract from the point by questioning my understanding or ability. When the point is that **your** understanding fails to include aliasing rules in any way, your code violates said rules, and you have never **once** tried to engage with that - instead throwing out irrelevant red herrings and hoping that'll convince people. I'm really beginning to wonder whether you're actively trolling here. No one is disagreeing with you over the memory layout in practice, but either way, interpreting it via a different type of pointer is UB – underscore_d Jul 31 '17 at 10:26
  • @underscore_d - ok - i paste special for you microsoft code from `guiddef.h` and `wdm.h` - llok how they access guids fields - this code also UB and wrong ? – RbMm Jul 31 '17 at 10:28
  • @RbMm I don't really care if Microsoft do it; whether any entity writes code in a particular way doesn't mean it's inherently correct code; none of them are infallible. Having said that, implementors are in some cases allowed to subvert the Standard if they have special knowledge of the compiler in which their code will be compiled, and special/different/implementation-defined behaviours it may have - so looking to internal implementations for ideas on how to write good client code is frequently discouraged (also for how indecipherable they tend to be). bottom line: MS don't write the Standard – underscore_d Jul 31 '17 at 10:30
  • @underscore_d - We have so different understanding of language, programming and binary standards that we will never agree – RbMm Jul 31 '17 at 10:32
  • Yes, until you actually read and acknowledge the part of the C Standard that has been shown to you for 2 hours now, that is impossible. – underscore_d Jul 31 '17 at 10:33
  • 1
    This is really stupid. **Of course**, Microsoft can exploit special guarantees they make **themselves in their own compiler** safely. If they change how their compiler works, they're in charge of changing other code as well. If you're just writing a C program, your only safe bet is to rely on the standard. –  Jul 31 '17 at 10:35
  • @underscore_d - i can not only read docs but understand how this work internally, on binary level. if you claim that this can be ub - very interesting look for concrete example of this. and `guid` only data struct which can be used not only from c/c++ but from another languages too. it have clear binary layout – RbMm Jul 31 '17 at 10:36
  • 1
    I'd say stop feeding the troll here. –  Jul 31 '17 at 10:36
  • @FelixPalmen - this code (InlineIsEqualGUID, IsEqualGUIDAligned) wrote not i however. simply copy-paste from sdk-wdk. you can check this if not believe – RbMm Jul 31 '17 at 10:37