I have integer values that are used to access data in unrelated data stores, i.e., handles. I have chosen to wrap the integers in a struct in order to have strongly typed objects so that the different integers cannot be mixed up. They are, and must be, POD. This is what I am using:

struct Mesh {
    int handle;
};
struct Texture {
    int handle;
};

I have arrays of these handles, such as: Texture* textureHandles;.

Sometimes I need to pass an array of handles as int* to more generic parts of the code. Right now I'm using:

int* handles = &textureHandles->handle;

which essentially takes a pointer to the first member of the first struct and treats it as the start of an int array.

My question is basically whether this is legal, or whether it violates strict aliasing to manipulate int* handles and Texture* textureHandles pointing to the same memory. I think this should be allowed since the underlying type (int) is accessed the same way in both cases. My reservation is that I access multiple structs by taking the address of a member inside one struct.

As an extension to my first question, would the following be ok?

int* handles = reinterpret_cast<int*>(textureHandles);
rasmus
  • You want to use structs to get strong types and then you want to cast away the type to get int. You get the worst of both worlds. – Neil Kirk Feb 15 '15 at 18:24
  • @NeilKirk Only very specific functions will use raw int* arrays. The rest will use the typed structs. They are simply there to avoid mistakes when using the handles in the general case. – rasmus Feb 15 '15 at 18:27
  • I think you should tell us more about your actual project as your design is very strange. – Neil Kirk Feb 15 '15 at 18:29
  • @NeilKirk Not sure why you think using handles is strange. Explaining the entire design could take a while, but it comes from data-oriented design, where different systems/managers hold data that can be accessed using handles. These handles can be simple integers, but that makes it easy to pass the wrong handles to the wrong system/manager. I can probably avoid the described conversion if necessary, which is why I asked the question. – rasmus Feb 15 '15 at 18:38
  • For example, perhaps you could add `operator int()` to your handles to avoid reinterpret_casts. – Neil Kirk Feb 15 '15 at 18:42
  • @Neil Kirk Assume he is using OpenGL (probably the same with Direct3D). Using strong types will help you, but at the end of the day you still have to pass your int (or an array of int) to the API. It is undesirable to copy an array of Texture to an array of int if they are binary identical. – sbabbi Feb 15 '15 at 20:32
  • @NeilKirk What's strange about it? I do almost exactly the same thing in all of my interfaces between languages, when an object is present in the code in one language, but must be accessed from another language. The swig generated interfaces do more or less the same thing. (The handle type may vary: mine is declared `void*`, and IIRC, the swig Java interface uses `long long`. But the idea is the same: the object is represented by a handle which is some sort of magic cookie which allows finding the object.) – James Kanze Feb 16 '15 at 11:57
  • Ok I stand corrected! – Neil Kirk Feb 16 '15 at 12:09

3 Answers

10

reinterpret_cast<int*>(textureHandles) is definitely just as fine as &textureHandles->handle. There's a special exception in the standard, inherited from C even, that says that a pointer to a standard-layout structure, suitably converted, points to the initial member of that structure, and vice versa.
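To illustrate the initial-member rule for a single object, here is a minimal sketch. The helper name writeThroughInt is hypothetical and not from the question, and the static_assert assumes a C++11 or later compiler. It covers one Texture only and says nothing about stepping through an array.

```cpp
#include <cassert>
#include <type_traits>

struct Texture {
    int handle;
};

// The initial-member guarantee applies to standard-layout types only.
static_assert(std::is_standard_layout<Texture>::value,
              "Texture must be standard-layout");

// Hypothetical helper: writes through an int* obtained by the cast,
// then reads the value back through the struct itself.
int writeThroughInt(Texture& t, int value) {
    int* viaMember = &t.handle;
    int* viaCast = reinterpret_cast<int*>(&t);
    assert(viaMember == viaCast);  // both spellings give the same address
    *viaCast = value;              // int lvalue modifying an int sub-object
    return t.handle;
}
```

Calling writeThroughInt on a Texture and then reading t.handle should give back the value that was written.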

Using that to modify the handle is also fine. It doesn't violate aliasing rules, because you're using an lvalue of type int to modify a sub-object of type int.

Incrementing the resulting pointer, and using it to access other elements in an array of Texture objects, is a bit iffy, though. Jerry Coffin already pointed out that it is possible that sizeof(Texture) > sizeof(int). Even if sizeof(Texture) == sizeof(int), though, pointer arithmetic is only defined for pointers into arrays (where an arbitrary object may be considered as an array of length 1). You don't have an array of int anywhere, so the addition is simply undefined.
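If the generic code really needs an int array, one way to stay within defined behaviour is to copy the handles out first. This is only a sketch: toRawHandles is a hypothetical name, and it costs a copy, which may not be acceptable if the array is large or passed often.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Texture {
    int handle;
};

// Hypothetical helper: copies each handle into a real array of int,
// so pointer arithmetic on result.data() is well-defined.
std::vector<int> toRawHandles(const Texture* textures, std::size_t count) {
    assert(textures != nullptr || count == 0);
    std::vector<int> raw;
    raw.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        raw.push_back(textures[i].handle);  // indexing a Texture array is fine
    }
    return raw;
}
```

The result's data() pointer can then be handed to code that expects an int*.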

  • Are you sure it doesn't violate aliasing rules? What if I wrote to the member using an int pointer, and then read the member through a Texture pointer? – Neil Kirk Feb 15 '15 at 18:31
  • @NeilKirk That's still fine. You have an lvalue of type `Texture` for an object of type `Texture`, and an lvalue of type `int` for a sub-object of type `int`. There's no aliasing problem there. –  Feb 15 '15 at 18:33
5

No, this isn't guaranteed to work. In particular, the compiler is allowed to insert padding after any element of a struct, but is not allowed to insert padding between elements of an array.

That said, with a struct of only one element (of type int, or something at least as large, such as long), chances are pretty good that most compilers won't insert any padding, so your current usage is probably fairly safe as a general rule.
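If you decide to rely on the absence of padding, you could at least make the assumption explicit so that the build fails on a compiler where it doesn't hold. A minimal sketch, assuming C++11 for static_assert and alignof; note that this only checks the layout, it does not make the standard bless the conversion:

```cpp
struct Mesh {
    int handle;
};
struct Texture {
    int handle;
};

// Fail the build if the compiler adds padding after the single member.
static_assert(sizeof(Mesh) == sizeof(int), "Mesh has unexpected padding");
static_assert(sizeof(Texture) == sizeof(int), "Texture has unexpected padding");

// Fail the build if the wrapper is aligned differently from a plain int.
static_assert(alignof(Texture) == alignof(int), "Texture alignment differs from int");
```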

Jerry Coffin
  • You don't discuss strict aliasing which I think would be a problem here, but I'm not an expert. – Neil Kirk Feb 15 '15 at 18:28
  • For this question I'm only interested in the specific case where a struct contains a single member. Is the compiler really allowed to insert padding in that case? – rasmus Feb 15 '15 at 18:29
  • @rasmus: Yes, it is. It can't insert padding before the element, but can after it. – Jerry Coffin Feb 15 '15 at 18:30
  • @NeilKirk: In this case, I see no real reason to discuss strict aliasing rules. The point I've raised is sufficient to answer the question. – Jerry Coffin Feb 15 '15 at 18:32
  • Thanks for the answer. I initially accepted it but after some thinking I decided to go with hvd's answer since it also discusses strict aliasing which I specifically asked about. Sometimes I wish I could accept two answers. At least you got my +1 – rasmus Feb 15 '15 at 20:40
  • @rasmus: IMO, asking about strict aliasing in this situation is a bit like saying: "if I'm going to walk through a known high-crime neighborhood at night, carrying large amounts of money, what color of socks should I wear?" Rather than "white" or "black", the right answer is just "don't do that." – Jerry Coffin Feb 15 '15 at 20:50
  • @JerryCoffin Yes, I totally get that. I just felt that mentioning it would answer the question more fully because of how the question was posed. Or perhaps explain why it's not relevant. Personally I found the additional information in hvd's answer related to strict aliasing to be useful. That could also help determine the validity of the method if the compiler actually guarantees that there will be no extra padding for this single-member struct. MSVC does guarantee this, for example. But like I said, I'm very grateful for your time and response. – rasmus Feb 15 '15 at 21:01
1

It certainly violates strict aliasing, and if the function can access the array both through the int* and a Mesh* or a Texture*, you may very well run into problems (although probably only if it modifies the array in some way).

From your description of the problem, I don't think the rules of strict aliasing are really what you are concerned with. The real issue is whether the compiler can add padding to the structs that isn't present in the int, so that sizeof( Mesh ) > sizeof( int ). And while the answer is formally yes, I can't imagine a compiler which would do so, at least today, and at least with int or larger types in the struct. (A word addressed machine would probably add padding to a struct which contained just char.)

The real question is probably more of whether the generic code is legacy, and cannot be changed, or not. Otherwise, the obvious solution is to create a generic handle type:

struct Handle
{
    int handle;
};

and then either derive your types from it, or use the reinterpret_cast as you propose. There is (or at least was) a guarantee that allowed accessing a member of a struct through a pointer to a different struct, as long as the member, and all preceding members were identical. This is how you simulate inheritance in C. And even if the guarantee has been removed—and the only reason it was ever present in C++ was for reasons of C compatibility—no compiler would dare violate it, given the amount of existing software that depends on it. (The implementation of Python, for example. And practically all Python plugins, including those written in C++.)
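As a rough sketch of the derivation option: the empty derived structs and the rawValue helper below are my own illustration, not code from the question. Under the C++11 and later rules, a class that adds no data members to a single base stays standard-layout and trivial, which the static_asserts check. Under the older C++03 definition of POD this would not hold.

```cpp
#include <type_traits>

struct Handle {
    int handle;
};

// Distinct types, so a Mesh cannot be passed where a Texture is expected.
struct Mesh : Handle {};
struct Texture : Handle {};

static_assert(!std::is_same<Mesh, Texture>::value, "types must stay distinct");
static_assert(std::is_standard_layout<Texture>::value, "Texture must be standard-layout");
static_assert(std::is_trivial<Texture>::value, "Texture must be trivial");

// Hypothetical generic code: sees only the shared base type.
int rawValue(const Handle& h) {
    return h.handle;
}
```

This covers passing single handles to generic code; it does not by itself make an array of Texture usable as an array of Handle or int.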

James Kanze
  • It doesn't violate strict aliasing to access objects or subobjects defined as `int` through lvalues of type `int`, and I really don't see how you might even make a legitimate argument otherwise. Could you elaborate? (The guarantee you point out doesn't actually guarantee what you think it does, even in C. It only works in unions. There is a different guarantee that does help here, that a pointer to a standard-layout struct points to its initial member.) –  Feb 15 '15 at 18:50
  • Note: I would wholeheartedly agree with your answer if it were the other way around. Given an arbitrary `int` object, attempting to access that as if it were a `Handle` is not a good idea, regardless of whether it's in an array. But that's not what the OP is asking about. –  Feb 15 '15 at 18:55
  • No part of the code is legacy. The code in question is used for mapping handles to data and has the ability to insert many handles at once into the map (hence the `int` arrays). But with the information from these awesome answers I'll probably rethink this part of the design. The problem with subclassing a Handle would be that the typed handles wouldn't be POD anymore, which is a requirement for me. – rasmus Feb 15 '15 at 18:56
  • @hvd You have an interesting point. According to the standard, the compiler can assume that an `int*` and a `Texture*` don't alias. But the compiler must assume that the `int*` can alias with `p->handle` (where `p` is a `Texture*`). With regards to the guarantee only applying to unions: you may be right formally (I couldn't find the actual text with this guarantee when I posted), but in practice, large quantities of code (like Python) depend on it working when casting the pointers as well. (There are probably restrictions.) – James Kanze Feb 16 '15 at 11:52
  • @JamesKanze And if I recall correctly, Python builds on GCC with `-fno-strict-aliasing` because GCC does optimise in ways that don't work for Python. –  Feb 16 '15 at 11:57