32

I have an example in which the alignment of a type is guaranteed: union max_align. I am looking for an even simpler example in which a union is used practically, to explain it to my friend.

timrau
yesraaj

18 Answers

33

I usually use unions when parsing text. I use something like this:

typedef enum DataType { INTEGER, FLOAT_POINT, STRING } DataType ;

typedef union DataValue
{
    int v_int;
    float v_float;
    char* v_string;
}DataValue;

typedef struct DataNode
{
    DataType type;
    DataValue value;
}DataNode;

void myfunct()
{
    long long temp;
    DataNode inputData;

    inputData.type= read_some_input(&temp);

    switch(inputData.type)
    {
        case INTEGER: inputData.value.v_int = (int)temp; break;
        case FLOAT_POINT: inputData.value.v_float = (float)temp; break;
        case STRING: inputData.value.v_string = (char*)temp; break;
    }
}

void printDataNode(DataNode* ptr)
{
   printf("I am a ");
   switch(ptr->type){
       case INTEGER: printf("Integer with value %d", ptr->value.v_int); break;
       case FLOAT_POINT: printf("Float with value %f", ptr->value.v_float); break;
       case STRING: printf("String with value %s", ptr->value.v_string); break;
   }
}

If you want to see how unions are used HEAVILY, check any code using flex/bison. For example, see splint; it contains TONS of unions.
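A self-contained variant of the same tagged-union idea, as a minimal sketch. The names Token, TokenType and parse_token are hypothetical and not part of the code above; the helper simply tries to read the text as an integer, then as a float, and otherwise keeps it as a string.

```c
#include <stdlib.h>

typedef enum { TOK_INTEGER, TOK_FLOAT, TOK_STRING } TokenType;

typedef struct {
    TokenType type;               /* tag: says which union member is valid */
    union {
        long        v_int;
        double      v_float;
        const char *v_string;     /* points at the caller's text, not copied */
    } value;
} Token;

/* Hypothetical helper: classify one piece of text and store it in the
   matching union member. */
Token parse_token(const char *text)
{
    Token tok;
    char *end;

    long i = strtol(text, &end, 10);
    if (end != text && *end == '\0') {      /* whole text was an integer */
        tok.type = TOK_INTEGER;
        tok.value.v_int = i;
        return tok;
    }

    double d = strtod(text, &end);
    if (end != text && *end == '\0') {      /* whole text was a float */
        tok.type = TOK_FLOAT;
        tok.value.v_float = d;
        return tok;
    }

    tok.type = TOK_STRING;                  /* anything else stays a string */
    tok.value.v_string = text;
    return tok;
}
```

The reader of a Token should always check type first and only then touch the matching member of value.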

Koray Tugay
Yousf
6

I've typically used unions where you want to have different views of the same data, e.g. a 32-bit colour value where you want both the 32-bit value and the red, green, blue and alpha components:

struct rgba
{
  unsigned char r;
  unsigned char g;
  unsigned char b;
  unsigned char a;
};

union  
{
  unsigned int val;
  struct rgba components;
}colorval32;

NB: You could also achieve the same thing with bit-masking and shifting, i.e.

#define GETR(val) (((val) & 0xFF000000) >> 24)

but I find the union approach more elegant.
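For comparison, here is a rough sketch of the masking-and-shifting route with fixed-width types. The 0xRRGGBBAA layout and the function names are assumptions chosen for illustration; unlike the union, the result does not depend on byte order or struct padding.

```c
#include <stdint.h>

/* Pack four 8-bit channels into one 32-bit value laid out as 0xRRGGBBAA
   (the layout is an assumption for this sketch). */
uint32_t rgba_pack(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Extract single channels again by shifting and masking. */
uint8_t rgba_red(uint32_t v)   { return (uint8_t)(v >> 24); }
uint8_t rgba_green(uint32_t v) { return (uint8_t)((v >> 16) & 0xFFu); }
uint8_t rgba_blue(uint32_t v)  { return (uint8_t)((v >> 8) & 0xFFu); }
uint8_t rgba_alpha(uint32_t v) { return (uint8_t)(v & 0xFFu); }
```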

chirag
zebrabox
  • 2
    But `struct rgba` may have padding, and even without padding, `sizeof(unsigned int)` may or may not be equal to `sizeof(struct rgba)` (4 for no padding). I.e., you probably don't want to do this. – Alok Singhal Dec 23 '09 at 15:52
  • 3
    @Alok: Not to mention that the `struct` will have `r`, `g`, `b`, and `a` in order (with or without padding), while the `unsigned int` byte order will depend on endianness. In other words, this code is not portable and may fail in odd ways when compiling on another processor. – David Thornley Dec 23 '09 at 18:44
  • 7
    Do note that (standard) C does not support this - assigning to one member in a union and then reading from another member is undefined behavior. Compilers/systems implementations do commonly support this though in their own specific ways. – nos Dec 23 '09 at 20:13
  • 1
    Thanks for the comments all. As you say this isn't portable ( nor is it intended to be) and relies on many assumptions which I didn't make clear in the post. Typically I wouldn't be using unsigned int but something like a typedef uint32_t i.e where I knew it was 32-bits. I'll amend my post to make this clearer – zebrabox Dec 24 '09 at 14:00
  • In addition to the portability issues, it's also a great way to confuse the compiler with potential aliasing, forcing it to skip a load of otherwise useful optimizations. – jalf Jan 29 '10 at 19:09
  • about what nos said; I think he means this is undefined behavior because of the strict aliasing rule. this should work when you use char as your "other" union types. – v.oddou Jun 29 '14 at 08:50
5

For accessing registers or I/O ports bytewise as well as bitwise by mapping that particular port to memory, see the example below:

typedef union
{
    unsigned int a;
    struct {
        unsigned bit0 : 1,
                 bit1 : 1,
                 bit2 : 1,
                 bit3 : 1,
                 bit4 : 1,
                 bit5 : 1,
                 bit6 : 1,
                 bit7 : 1,
                 bit8 : 1,
                 bit9 : 1,
                 bit10 : 1,
                 bit11 : 1,
                 bit12 : 1,
                 bit13 : 1,
                 bit14 : 1,
                 bit15 : 1;
    } bits;
} IOREG;

/* volatile: the hardware can change the register behind the compiler's back */
#define PORTA (*(volatile IOREG *) 0x3B)
...
unsigned int i = PORTA.a;      // read the whole register
int j = PORTA.bits.bit0;       // read a single bit
...
PORTA.bits.bit0 = 1;           // write a single bit
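The fixed address above only makes sense on the target hardware. To try the idea on a desktop machine, something like the following sketch can be used; fake_port is a hypothetical ordinary variable standing in for the register, and Reg and the function names are likewise made up. Which bit of word ends up set is implementation-defined, so the code only relies on it being some bit.

```c
typedef union {
    unsigned int word;            /* whole-register view */
    struct {
        unsigned bit0 : 1;
        unsigned bit1 : 1;
    } bits;                       /* single-bit view */
} Reg;

/* Hypothetical stand-in for a memory-mapped register. */
static volatile Reg fake_port;

/* Clear the register, set bit0 through the bit-field view, and return
   the whole word as seen through the other view. */
unsigned int port_set_bit0(void)
{
    fake_port.word = 0;
    fake_port.bits.bit0 = 1;
    return fake_port.word;
}

/* Read bit0 back through the bit-field view. */
unsigned int port_read_bit0(void)
{
    return fake_port.bits.bit0;
}
```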
wrapperm
  • 6
    Never use this if you have to handle big/little endian conversions – mouviciel Dec 23 '09 at 09:02
  • 1
    @mouviciel: This is not affected by endianness... can you please support your comment, if so..? – wrapperm Dec 23 '09 at 11:13
  • 8
    From the C standard: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified." – Alok Singhal Dec 23 '09 at 15:56
  • 1
    However, in many cases registers or I/O ports *are* platform-specific, so platform-independent code isn't such a concern. Though I suppose you might write a driver for some hardware that is used on both big-endian and little-endian machines (e.g. a Linux driver for a graphics chip)... in that case you'd know what you're doing I'm sure. – Craig McQueen Dec 24 '09 at 00:04
  • 1
    The alignment of the addressable storage unit can be easily determined by a simple program... actually speaking you should have an idea about the architectural endianness and stuff like that before coding for a particular platform... – wrapperm Dec 24 '09 at 07:31
  • 1
    This misses a disclaimer that says "Reading from inactive union members yields undefined behavior". – Sebastian Mach Jul 08 '11 at 08:44
3

In the Windows world, unions are commonly used to implement tagged variants, which are (or were, before .NET?) one standard way of passing data between COM objects.

The idea is that a union type can provide a single natural interface for passing arbitrary data between two objects. Some COM object could pass you a variant (e.g. type VARIANT or _variant_t) which could contain either a double, float, int, or whatever.

If you have to deal with COM objects in Windows C++ code, you'll see variant types all over the place.

Nate Kohl
2
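#include <stdbool.h>   /* for bool when compiled as C */

struct cat_info
{
    int legs;
    int tailLen;
};

struct fish_info
{
    bool hasSpikes;
};

/* a named union type, so it can be used as a member below */
union animal_data
{
    struct fish_info fish;
    struct cat_info cat;
};

struct animal
{
    char* name;
    int animal_type;          /* tells which member of data is valid */
    union animal_data data;
};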
ironic
2

Unions are useful if you have different kinds of messages, because the intermediate levels don't have to know the exact type. Only the sender and receiver need to parse the actual message. Any other level only really needs to know the size and possibly the sender and/or receiver info.
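A minimal sketch of that layering, with made-up message kinds and field names (nothing here comes from a real protocol): the header is all an intermediate level looks at, while the union holds whatever the endpoints agreed on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum msg_kind { MSG_PING, MSG_TEXT };    /* hypothetical message kinds */

struct msg_header {
    uint8_t  kind;                       /* one of enum msg_kind */
    uint8_t  sender;
    uint8_t  receiver;
    uint16_t payload_size;               /* bytes of payload actually used */
};

struct message {
    struct msg_header header;
    union {                              /* only the endpoints interpret this */
        struct { uint32_t sequence; } ping;
        struct { char text[32]; }     text;
    } payload;
};

/* Intermediate level: decides how many bytes to pass on by looking at
   the header only; it never inspects the payload. */
size_t bytes_to_forward(const struct message *m)
{
    return offsetof(struct message, payload) + m->header.payload_size;
}

/* Sender side: build a ping message. */
struct message make_ping(uint8_t sender, uint8_t receiver, uint32_t sequence)
{
    struct message m;
    memset(&m, 0, sizeof m);
    m.header.kind = MSG_PING;
    m.header.sender = sender;
    m.header.receiver = receiver;
    m.header.payload_size = sizeof m.payload.ping;
    m.payload.ping.sequence = sequence;
    return m;
}
```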

Makis
2

SDL uses a union for representing events: http://www.libsdl.org/cgi/docwiki.cgi/SDL_Event.

Bastien Léonard
1

Do you mean something like this?

union {
   long long a;
   unsigned char b[sizeof(long long)];
} long_long_to_single_bytes;

ADDED:

I have recently used this on our AIX machine to transform the 64-bit machine identifier into a byte array.

std::string getHardwareUUID(void) {
#ifdef AIX
   struct xutsname m; // aix specific struct to hold the 64bit machine id
   unamex(&m);        // aix specific call to get the 64bit machine id
   long_long_to_single_bytes.a = m.longnid;
   return convertToHexString(long_long_to_single_bytes.b, sizeof(long long));
#else // Windows or Linux or Solaris or ...
   ... get a 6byte ethernet MAC address somehow and put it into mac_buf
   return convertToHexString(mac_buf, 6);
#endif
}
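A portable toy version of the same byte-array trick, as a sketch in plain C. U64Bytes and both function names are hypothetical. The union variant prints the bytes in memory order, so its output differs between big-endian and little-endian machines; the shift-based variant is included for contrast and always prints the most significant byte first.

```c
#include <stdint.h>
#include <stdio.h>

typedef union {
    uint64_t      value;
    unsigned char bytes[sizeof(uint64_t)];
} U64Bytes;

/* Writes 16 hex digits plus a terminating NUL into out, taking the bytes
   in the order they sit in memory (endianness-dependent). */
void u64_to_hex_memory_order(uint64_t v, char out[17])
{
    U64Bytes u;
    u.value = v;
    for (size_t i = 0; i < sizeof u.bytes; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)u.bytes[i]);
}

/* Same output format, but most significant byte first on every machine. */
void u64_to_hex_portable(uint64_t v, char out[17])
{
    for (int i = 0; i < 8; i++)
        snprintf(out + 2 * i, 3, "%02x",
                 (unsigned)((v >> (56 - 8 * i)) & 0xFFu));
}
```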
  • Yes, but I want to see where it is practically applied, something like max_align – yesraaj Dec 23 '09 at 08:23
  • If you use something like this, you should take care of the byte order (endianness). This code may work on one machine (big-endian) and not on another (little-endian). – Yousf Dec 23 '09 at 14:03
  • It works on BE and on LE machines. Only the results are different :-) –  Dec 23 '09 at 17:30
  • @yesraaj: I have recently used this on our AIX machine to transform the 64-bit machine identifier into a byte array. Note: this machine identifier is something AIX-specific. –  Dec 23 '09 at 17:34
1

I've sometimes used unions this way:

//Define type of structure
typedef enum { ANALOG, BOOLEAN, UNKNOWN } typeValue_t;
//Define the union
typedef struct  {
  typeValue_t typeValue;
  /*On this structure you will access the correct type of
    data according to its type*/
  union {
    float ParamAnalog;
    char  ParamBool;
  };
} Value_t;

Then you could declare arrays holding different kinds of values, storing the data more or less efficiently, and perform some "polymorphic" operations like:

 void printValue ( Value_t value ) {
    switch (value.typeValue) {
       case BOOLEAN:
          printf("Boolean: %c\n", value.ParamBool?'T':'F');
          break;
       case ANALOG:
          printf("Analog: %f\n", value.ParamAnalog);
          break;
       case UNKNOWN:
          printf("Error, value UNKNOWN\n");
          break;
    }
 }
Craig McQueen
Khelben
1

Here is another example where a union could be useful.

(Not my own idea; I found this in a document discussing C++ optimizations.)

begin-quote

.... Unions can also be used to save space, e.g.

first the non-union approach:

void F3(bool useInt) {
    if (useInt) {
        int a[1000];
        F1(a);  // call a function which expects an array of int as parameter
    }
    else {
        float b[1000];
        F2(b);  // call a function which expects an array of float as parameter
    }
}

Here it is possible to use the same memory area for a and b because their live ranges do not overlap. You can save a lot of cpu-cache space by joining a and b in a union:

void F3(bool useInt) {

    union {
        int a[1000];
        float b[1000];
    };

    if (useInt) {
        F1(a);  // call a function which expects an array of int as parameter
    }
    else {
        F2(b);  // call a function which expects an array of float as parameter
    }
}

Using a union is not a safe programming practice, of course, because you will get no warning from the compiler if the uses of a and b overlap. You should use this method only for big objects that take a lot of cache space. ...

end-quote
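The quoted snippet relies on a C++ anonymous union inside a function. In plain C the same idea needs a named union object; here is a rough sketch, where Scratch, BUF_LEN and the two fill_and_sum helpers are hypothetical stand-ins for F1 and F2. An optimizing compiler may already reuse the stack slot in the non-union version, so treat the saving as something to measure rather than assume.

```c
#define BUF_LEN 1000

typedef union {
    int   as_int[BUF_LEN];
    float as_float[BUF_LEN];
} Scratch;                        /* as large as its largest member */

/* Hypothetical stand-in for F1: fills the int array and sums it. */
static long fill_and_sum_ints(int *a)
{
    long sum = 0;
    for (int i = 0; i < BUF_LEN; i++) { a[i] = i; sum += a[i]; }
    return sum;
}

/* Hypothetical stand-in for F2: fills the float array and sums it. */
static float fill_and_sum_floats(float *b)
{
    float sum = 0.0f;
    for (int i = 0; i < BUF_LEN; i++) { b[i] = 0.5f; sum += b[i]; }
    return sum;
}

double F3(int use_int)
{
    Scratch s;                    /* one buffer shared by both branches */
    if (use_int)
        return (double)fill_and_sum_ints(s.as_int);
    return (double)fill_and_sum_floats(s.as_float);
}
```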

1
  • When reading serialized data that needs to be coerced into specific types.
  • When returning semantic values from lex to yacc. (yylval)
  • When implementing a polymorphic type, especially one that reads a DSL or general language
  • When implementing a dispatcher that specifically calls functions intended to take different types.
DigitalRoss
1

Recently I think I saw a union used in vector programming. Vector programming is used in Intel MMX technology, GPU hardware, IBM's Cell Broadband Engine, and others.

A vector may correspond to a 128-bit register. It is very commonly used in SIMD architectures. Since the hardware has 128-bit registers, you can store four single-precision floating-point values in a register/variable. An easy way to construct, convert, and extract individual elements of a vector is to use a union.

typedef union {
    vector4f vec; // processor-specific built-in type
    struct { // human-friendly access for transformations, etc
        float x;
        float y;
        float z;
        float w;
    };
    struct { // human-friendly access for color processing, lighting, etc
        float r;
        float g;
        float b;
        float a;
    };
    float arr[4]; // yet another convenience access
} Vector4f;

int main()
{
    Vector4f position, normal, color;
    // human-friendly access
    position.x = 12.3f;
    position.y = 2.f;
    position.z = 3.f;
    position.w = 1.f;

    // computer friendly access
    //some_processor_specific_operation(position.vec,normal.vec,color.vec);
    return 0;
}

If you go down the path of PlayStation 3 multi-core programming or graphics programming, there is a good chance you'll run into more of this.
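Without a processor-specific vector type at hand, the named-fields-plus-array part can still be tried in standard C11. This is only a sketch: Vec4 and vec4_dot are made-up names, and the static assertion records the assumption that the four floats are packed without padding.

```c
typedef union {
    struct { float x, y, z, w; };   /* C11 anonymous struct: position view */
    struct { float r, g, b, a; };   /* colour view of the same four floats */
    float arr[4];                   /* indexable view for loops */
} Vec4;

/* Assumption: no padding between the four floats. */
_Static_assert(sizeof(Vec4) == 4 * sizeof(float), "unexpected padding in Vec4");

/* Dot product written against the array view; callers can fill the
   vectors through whichever named view reads best. */
float vec4_dot(const Vec4 *p, const Vec4 *q)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += p->arr[i] * q->arr[i];
    return sum;
}
```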

Afriza N. Arief
1

I know I'm a bit late to the party, but as a practical example, the Variant datatype in VBScript is, I believe, implemented as a union. The following code is a simplified example taken from an article:

struct tagVARIANT
{
    VARTYPE vt;          // type tag: tells which member of the union below is valid
    WORD wReserved1;
    WORD wReserved2;
    WORD wReserved3;
    union 
    {
        LONG lVal;
        BYTE bVal;
        SHORT iVal;
        FLOAT fltVal;
        DOUBLE dblVal;
        VARIANT_BOOL boolVal;
        DATE date;
        BSTR bstrVal;
        SAFEARRAY *parray;
        VARIANT *pvarVal;
    };
};

The actual implementation (as the article states) is found in the oaidl.h C header file.

0

Example:

When using different socket types, but you want a common type to refer to them.
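One way this can look with the POSIX socket address structures, as a sketch: SockAddrAny and the two helper functions are hypothetical names, and the example assumes a POSIX system where these headers exist. The generic sa view is used to read the address family, which then tells you which specific member is valid.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

typedef union {
    struct sockaddr     sa;     /* generic view, what bind/connect expect */
    struct sockaddr_in  in4;    /* IPv4 view */
    struct sockaddr_in6 in6;    /* IPv6 view */
} SockAddrAny;

/* Length to pass along with the address, chosen by the stored family. */
socklen_t sockaddr_any_len(const SockAddrAny *addr)
{
    switch (addr->sa.sa_family) {
    case AF_INET:  return sizeof addr->in4;
    case AF_INET6: return sizeof addr->in6;
    default:       return 0;    /* unknown family */
    }
}

/* Fill in an IPv4 loopback address with the given port. */
SockAddrAny make_ipv4_loopback(unsigned short port)
{
    SockAddrAny addr;
    memset(&addr, 0, sizeof addr);
    addr.in4.sin_family = AF_INET;
    addr.in4.sin_port = htons(port);
    addr.in4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return addr;
}
```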

Xolve
0

One more example: to avoid casts.

typedef union {
  long int_v;    /* assumes long is 32 bits wide, like float */
  float float_v;
} int_float;

void foo(float v) {
  int_float i;
  i.float_v = v;
  /* IEEE 754 single precision: 1 sign bit, 8 exponent bits (bias 127), 23 fraction bits */
  printf("sign=%ld exp=%ld fraction=%ld", (i.int_v>>31)&1, ((i.int_v>>23)&0xff)-127, i.int_v&((1L<<23)-1));
}

instead of:

void foo(float v) {
  long i = *((long*)&v);
  printf("sign=%ld exp=%ld fraction=%ld", (i>>31)&1, ((i>>23)&0xff)-127, i&((1L<<23)-1));
}
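A fixed-width sketch of the same decomposition, which sidesteps the question of how wide long is. FloatBits, FloatParts and float_parts are hypothetical names, and the code assumes float is a 32-bit IEEE 754 single, which the static assertion only partly checks.

```c
#include <stdint.h>

typedef union {
    float    f;
    uint32_t u;
} FloatBits;

/* Assumption: float occupies exactly 32 bits. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float is not 32 bits");

typedef struct {
    unsigned sign;       /* 0 for positive, 1 for negative */
    int      exponent;   /* unbiased: stored value minus 127 */
    uint32_t fraction;   /* low 23 bits, without the implicit leading 1 */
} FloatParts;

FloatParts float_parts(float v)
{
    FloatBits b;
    FloatParts p;

    b.f = v;                                   /* write the float view */
    p.sign     = (unsigned)(b.u >> 31);        /* read the integer view */
    p.exponent = (int)((b.u >> 23) & 0xFFu) - 127;
    p.fraction = b.u & 0x7FFFFFu;
    return p;
}
```

For instance, -2.5f is -1.25 × 2^1, so this yields sign 1, exponent 1 and fraction 0x200000.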
fortran
  • 3
    using a union to cast like this is technically undefined behaviour (although i'm unaware of any implementations which don't actually do what you want here) – jk. Dec 23 '09 at 09:01
  • A common term for this technique is "type punning" – fbrereto Dec 23 '09 at 20:02
  • @jk The very same undefined behaviour of getting the address, casting to a int pointer and then getting the contents. The only thing undefined here should be the endianness of the float, that is not specified by any standard. – fortran Dec 29 '09 at 10:03
  • This technique blows up when your longs are 64 bit and your floats are not. e.g. linux 64 bit. – nos Jan 29 '10 at 19:25
  • @nos it won't "blow up", you'll keep getting the binary value of the float in either the lower or the upper part of the int (depending on the endiannes) and in the other, maybe trash... you can use a fixed size type if you wish, but this was just an example, not production quality code. – fortran Jan 30 '10 at 08:55
  • I consider my applicatons to "blow up" when it starts producing or processing garbage :-) – nos Jan 30 '10 at 11:02
0

For convenience, I use unions to let me use the same class to store xyzw and rgba values

#ifndef VERTEX4DH
    #define VERTEX4DH

    struct Vertex4d{

        union {
            double x;
            double r;
        };
        union {
            double y;
            double g;
        };
        union {
            double z;
            double b;
        };
        union {
            double w;
            double a;
        };

        Vertex4d(double x=0, double y=0,double z=0,double w=0) : x(x), y(y),z(z),w(w){}
    };

#endif
Tom J Nowell
0

Many examples of unions can be found in <X11/Xlib.h>. A few others are in some IP stacks (in BSD <netinet/ip.h>, for instance).

As a general rule, protocol implementations use the union construct.

mouviciel
0

Unions can also be useful when type punning, which is desirable in a select few places (such as some techniques for floating-point comparison algorithms).

fbrereto