Why can union be used this way?

Question

Isn't it true that the members of a union are exclusive, so that you can't refer to one of them once you have already used another?

union Ptrlist
{
    Ptrlist *next;
    State *s;
};

void
patch(Ptrlist *l, State *s)
{
    Ptrlist *next;

    for(; l; l=next){
        next = l->next;
        l->s = s;
    }
}

But the code above refers to both next and s at the same time. Can anyone explain this?

score 5 · Answer 1 · answered Aug 09 '11 at 14:51

A union only guarantees that all of its members start at the same address:

(void *)&l->next == (void *)&l->s

That's all. The language places no restriction on which member you access first.
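A minimal sketch of that guarantee, in case it helps to see it checked. The typedefs, the opaque State declaration, and the helper names members_share_address and check_layout are my own additions for illustration; they are not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct State State;      /* opaque here: only pointers to it are needed */
typedef union Ptrlist Ptrlist;   /* assumed typedef so "Ptrlist *" compiles in C */

union Ptrlist {
    Ptrlist *next;
    State *s;
};

/* Returns 1 when both members start at the address of the union itself. */
int members_share_address(Ptrlist *l)
{
    /* Casts to void * are needed because the pointer types differ. */
    return (void *)&l->next == (void *)l && (void *)&l->s == (void *)l;
}

/* Aborts if the layout guarantee does not hold. */
void check_layout(void)
{
    Ptrlist p = { 0 };   /* initializes the first member, next, to a null pointer */

    assert(offsetof(Ptrlist, next) == 0 && offsetof(Ptrlist, s) == 0);
    assert(members_share_address(&p));
}
```

Calling check_layout() from main should return silently on any conforming compiler.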

Patrick B.


the &-operator has lower-priority than the ->-operator, so here the & returns the address of ->next and ->s respectively. – Patrick B. Aug 09 '11 at 15:14
@lexer: If both are the same size, say 4 bytes, they will essentially point to the same memory. – Aug 09 '11 at 15:25
@lexer: In [this](http://stackoverflow.com/questions/6352199/memory-layout-of-union-of-different-sized-member) question, the C99 standard is referenced saying that a pointer to the union is (under a conversion) a pointer to every one of the unions members. – Ken Wayne VanderLinde Aug 09 '11 at 15:49
@lexer: in this particular example yes: &l->next == &l->s == &l. but usually unions are used with struct-fields and in __packed__-format and then the fun starts. – Patrick B. Aug 09 '11 at 18:30

score 2 · Answer 2 · answered Aug 09 '11 at 15:15

As others have already pointed out, all members of a union are active at all times. The only thing to consider is whether the members are each in a valid state.

If you ever do want some level of exclusivity, you would instead require a tagged union. The basic idea is to wrap the union in a struct, and the struct has a member identifying which element in the union should be used. Take this example:

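enum Tag {
    FIRST,
    SECOND
};

struct {
    enum Tag tag;       /* "enum Tag" is required in C; plain "Tag" only works in C++ */
    union {             /* anonymous union: needs C11 or later */
        int First;
        double Second;
    };
} taggedUnion;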

Now taggedUnion could be used like:

if (taggedUnion.tag == FIRST) {
    /* use taggedUnion.First */
} else {
    /* use taggedUnion.Second */
}

This is a neat trick that can help keep pseudo-polymorphic code correct, provided you check the Tag value when appropriate. It should be pointed out, though, that it does not really guarantee any level of exclusion. – nic, Aug 20 '13 at 14:01
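One way to get closer to real exclusivity is to hide the union behind accessor functions that check the tag. This is only a sketch under my own assumptions: the struct name Tagged, the named union member u, and the set_/get_ helpers are hypothetical and not part of the answer above. Nothing stops a caller from touching the fields directly, so it is still a convention rather than a guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum Tag { FIRST, SECOND };

struct Tagged {
    enum Tag tag;
    union {
        int First;
        double Second;
    } u;                /* named member, so this also compiles as C99 */
};

/* Setters store the value and record which member is now meaningful. */
void set_first(struct Tagged *t, int v)
{
    assert(t != NULL);
    t->tag = FIRST;
    t->u.First = v;
}

void set_second(struct Tagged *t, double v)
{
    assert(t != NULL);
    t->tag = SECOND;
    t->u.Second = v;
}

/* Getters succeed only if the tag says that member was the one stored. */
bool get_first(const struct Tagged *t, int *out)
{
    assert(t != NULL && out != NULL);
    if (t->tag != FIRST)
        return false;
    *out = t->u.First;
    return true;
}

bool get_second(const struct Tagged *t, double *out)
{
    assert(t != NULL && out != NULL);
    if (t->tag != SECOND)
        return false;
    *out = t->u.Second;
    return true;
}
```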

score 1 · Answer 3 · 2011-08-09T15:07:30.920

You first copy l->next into the local variable next. Then you "overwrite" that slot through the assignment l->s = s.

Because next and s share the same storage, assigning to l->s overwrites the memory that held l->next. Both members are pointers and will normally have the same "size", so in that sense both are "active" at the same time.
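To watch the read-then-overwrite order in action, here is a small sketch that runs the question's patch() on a three-slot list. The State definition and the link_slots helper are stand-ins I made up for the demo; the real State type is not shown in the question.

```c
#include <assert.h>
#include <stddef.h>

typedef struct State { int id; } State;   /* hypothetical stand-in */
typedef union Ptrlist Ptrlist;

union Ptrlist {
    Ptrlist *next;
    State *s;
};

/* Same logic as patch() in the question. */
void patch(Ptrlist *l, State *s)
{
    Ptrlist *next;

    for (; l; l = next) {
        next = l->next;   /* read the link while the slot still holds it */
        l->s = s;         /* then reuse the same storage for the State pointer */
    }
}

/* Links n slots into a list through their next members; returns the head. */
Ptrlist *link_slots(Ptrlist *slots, size_t n)
{
    assert(slots != NULL && n > 0);
    for (size_t i = 0; i + 1 < n; i++)
        slots[i].next = &slots[i + 1];
    slots[n - 1].next = NULL;
    return slots;
}
```

After patch(link_slots(slots, 3), &target), every slots[i].s should equal &target, and the list links are gone because they lived in the same bytes.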

edited Aug 09 '11 at 15:07

answered Aug 09 '11 at 14:49

ok obviously I screwed the pooch on a C Standard I obviously missed, so downvoter, please explain – Aug 09 '11 at 14:50
not the downvoter (wouldn't downvote your answer) but 'activate' should be replaced with 'overwrite', that's more correct – KevinDTimm Aug 09 '11 at 14:57
@Kevin: Ok, no problem. I always thought of it as activating. – Aug 09 '11 at 14:58
you probably know that a 'common' use of a union is to mask one type to another, something like `union x {char cx[4]; long lx;}` so that you can see the bytes of a long via assignation. there's no activation involved as both are 'active' all the time. – KevinDTimm Aug 09 '11 at 15:01
@Kevin: Yes, in that case you are correct. But that is only because 4 byte = same length at long (4 bytes) - correct? – Aug 09 '11 at 15:04
yes, but I do the same for others `union x {char cx[8]; int ix; long lx; double dx;};` NOTE: this is machine dependent but very common. – KevinDTimm Aug 09 '11 at 15:07
@Kevin: How does that work if `int` is 4 bytes, `long` is 4 bytes? Are you just masking the int and long and not the double? – Aug 09 '11 at 15:08
@CodeMonkey let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/2276/discussion-between-kevindtimm-and-code-monkey) – KevinDTimm Aug 09 '11 at 15:09

score 1 · Answer 4 · answered Aug 09 '11 at 14:51

Yes, it's supposed to be like that. Both s and next occupy the same memory location, and you can use either member at any time; they are not exclusive.

duedl0r


score 1 · Answer 5 · 2012-04-05T20:47:03.280

No, it's not true. You can use any member of a union at any time, although the results if you read a member that wasn't the one most recently written to are a little bit complicated. But that isn't even happening in your code sample and there's absolutely nothing wrong with it. For each item in the list its next member is read and then its s member is written, overwriting its next.
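For what "a little bit complicated" can look like in practice: as far as I know, C (since a footnote added in C99 TC3 and kept in C11) says that reading a member other than the one last written reinterprets the stored bytes as the new type, while C++ treats it as undefined behaviour. The sketch below relies on the C rule and additionally assumes that float is a 32-bit IEEE 754 value; the union FloatBits and the function float_to_bits are names I made up.

```c
#include <assert.h>
#include <stdint.h>

union FloatBits {
    float f;
    uint32_t u;
};

/* Stores a float, then reads the same bytes back as an unsigned integer. */
uint32_t float_to_bits(float f)
{
    union FloatBits fb;

    /* The trick only makes sense if both members have the same size. */
    assert(sizeof fb.f == sizeof fb.u);

    fb.f = f;      /* write one member ... */
    return fb.u;   /* ... read the other: valid C, not valid C++ */
}
```

Under those assumptions, float_to_bits(1.0f) gives 0x3F800000. In C++ the portable alternative is memcpy.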

edited Apr 05 '12 at 20:47

answered Aug 09 '11 at 14:55

hobbs


a little bit complicated = undefined behaviour ? I know this is strictly UB in C++, but I don't know about C. – Alexandre C. Aug 09 '11 at 14:58
@Alexandre: How is it UB? He stores l->next into local next and then overwrites l->next with l->s, since only one member of a Union can be "active" at a time. – Aug 09 '11 at 14:59
@Code Monkey: OP's code is perfectly defined. I'm asking about the answer's sentence "the results if you read a member that wasn't the one most recently written to are a little bit complicated" – Alexandre C. Aug 09 '11 at 15:13
@Alexandre: In that case, yes - the sentence is confusing. I was particularly referring to "I know this is strictly UB" in your comment. I don't understand what you were saying. – Aug 09 '11 at 15:17
@Code Monkey: Quoting hobbs' answer: "the results if you read a member that wasn't the one most recently written to are a little bit complicated". In C++, reading another member than the last one written is plain UB, not "a little bit more complicated". I'd like to know if C has the same wording. – Alexandre C. Aug 09 '11 at 15:37
@Alexandre: I guess I don't understand. I thought unions were the same in C++ as they were in C – Aug 09 '11 at 15:38

score 0 · Answer 6 · answered Aug 09 '11 at 15:12

See http://publications.gbdirect.co.uk/c_book/chapter6/unions.html for an introductory discussion on unions.

Basically, it's an easy way to do type casting in advance. So, instead of having

int query_my_data(void *data, int data_len) {
  /* Guess the payload type from its size; only works if the two sizes differ. */
  switch(data_len) {
    case sizeof(my_data_t): return ((my_data_t *)data)->value;
    case sizeof(my_other_data_t): return ((my_other_data_t *)data)->other_val;
    default: return -1;
  }
}

You could simplify it by doing

typedef struct {
  int data_type;
  union {
    my_data_t my_data;
    my_other_data_t other_data;
  } union_data;
} my_union_data_t;

int query_my_data(my_union_data_t *data) {
  /* data_type records which union member is currently meaningful. */
  switch(data->data_type) {
    case TYPE_MY_DATA: return data->union_data.my_data.value;
    case TYPE_MY_OTHER_DATA: return data->union_data.other_data.other_val;
    default: return -1;
  }
}

Where my_data and other_data would have the same starting address in memory.
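The snippets above leave my_data_t, my_other_data_t and the TYPE_ constants undefined. A compilable sketch could look like the following; the one-field structs and the enum values are placeholders I invented, so substitute your real types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload types and tags; the answer does not define them. */
typedef struct { int value; } my_data_t;
typedef struct { int other_val; } my_other_data_t;
enum { TYPE_MY_DATA, TYPE_MY_OTHER_DATA };

typedef struct {
    int data_type;                   /* which union member is meaningful */
    union {
        my_data_t my_data;
        my_other_data_t other_data;
    } union_data;
} my_union_data_t;

/* Returns the payload's integer field, or -1 for an unknown tag. */
int query_my_data(const my_union_data_t *data)
{
    assert(data != NULL);
    switch (data->data_type) {
    case TYPE_MY_DATA:       return data->union_data.my_data.value;
    case TYPE_MY_OTHER_DATA: return data->union_data.other_data.other_val;
    default:                 return -1;
    }
}
```

Compared with the void * version, the caller can no longer pass a mismatched length, and two payload types of equal size stay distinguishable.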

score 0 · Answer 7 · answered Aug 11 '11 at 11:03

A union is similar to a struct, but it allocates only enough memory for its largest member, and all members share that storage. The size of the union is therefore at least the size of its largest member (the compiler may add padding for alignment). For example:

union A {
    unsigned char c;
    unsigned short s;
};

int sizeofA = sizeof(union A); // typically 2 bytes

union B {
    unsigned char c[4];
    unsigned short s[2];
    unsigned int i;
};

int sizeofB = sizeof(union B); // typically 4 bytes

In the second example, on a little-endian machine with 8-bit chars and 16-bit shorts, s[0] == (((c[1] << 8) & 0xff00) | c[0]). The members c, s and i overlap.

union B b;
// This assignment
b.s[0] = 0;
// has the same effect as these two (with a 2-byte short):
b.c[0] = 0;
b.c[1] = 0;
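If you want to check the overlap without guessing at type sizes or byte order, something like this sketch works. It swaps the plain types for the fixed-width ones from stdint.h (my substitution, so the sizes are known), and the helpers is_little_endian and s0_from_bytes are hypothetical names. Mixed-endian machines are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

union B {
    uint8_t  c[4];
    uint16_t s[2];
    uint32_t i;
};

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int is_little_endian(void)
{
    union B probe;

    probe.i = 1;
    return probe.c[0] == 1;   /* the low byte comes first on little-endian */
}

/* Rebuilds s[0] from the two bytes it overlaps, honouring byte order. */
uint16_t s0_from_bytes(const union B *b)
{
    assert(b != NULL);
    if (is_little_endian())
        return (uint16_t)((b->c[1] << 8) | b->c[0]);
    return (uint16_t)((b->c[0] << 8) | b->c[1]);
}
```

For any value stored through b.i, s0_from_bytes(&b) should equal b.s[0], and sizeof(union B) is 4 with these types.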

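A union is not restricted to primitive types and pointers: in C its members can be any complete object type, including structs and arrays. In C++ before C++11, a union could not contain class types with non-trivial constructors, destructors, or copy assignment operators; C++11 relaxed that rule. Most other rules are the same as for a struct, such as public member access by default and the usual storage rules.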

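In your example the union works as written, because each slot only ever needs one of the two pointers at a time. You would need a struct instead of a union only if next and s had to hold independent values simultaneously.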
