C: Where is union practically used?

Question

I have an example in which the alignment of a type is guaranteed: union max_align. I am looking for an even simpler example in which a union is used practically, to explain it to my friend.

I usually see union used to achieve a poor man's form of polymorphism. — E.M., Dec 23 '09 at 08:29
Why not use the google code search? Leads you to tons of practical examples :) http://www.google.com/codesearch?hl=de&sa=N&q=union++lang:c&ct=rr&cs_r=lang:c — Christian, Dec 23 '09 at 08:29

score 33 · Accepted Answer · edited Mar 20 '15 at 13:11

I usually use unions when parsing text. I use something like this:

typedef enum DataType { INTEGER, FLOAT_POINT, STRING } DataType ;

typedef union DataValue
{
    int v_int;
    float v_float;
    char* v_string;
}DataValue;

typedef struct DataNode
{
    DataType type;
    DataValue value;
}DataNode;

void myfunct()
{
    long long temp;
    DataNode inputData;

    inputData.type= read_some_input(&temp);

    switch(inputData.type)
    {
        case INTEGER: inputData.value.v_int = (int)temp; break;
        case FLOAT_POINT: inputData.value.v_float = (float)temp; break;
        case STRING: inputData.value.v_string = (char*)temp; break;
    }
}

void printDataNode(DataNode* ptr)
{
   printf("I am a ");
   switch(ptr->type){
       case INTEGER: printf("Integer with value %d", ptr->value.v_int); break;
       case FLOAT_POINT: printf("Float with value %f", ptr->value.v_float); break;
       case STRING: printf("String with value %s", ptr->value.v_string); break;
   }
}

If you want to see how unions are used HEAVILY, check any code using flex/bison. For example see splint, it contains TONS of unions.

ie. Tagged unions with pattern matching in c. Do lots of people use unions without ad-hoc type tagging? — Roman A. Taycher, Dec 02 '10 at 23:28

score 6 · Answer 2 · edited Dec 21 '22 at 08:53

I've typically used unions where you want different views of the same data, e.g. a 32-bit colour value where you want both the 32-bit value and the red, green, blue and alpha components:

struct rgba
{
  unsigned char r;
  unsigned char g;
  unsigned char b;
  unsigned char a;
};

union  
{
  unsigned int val;
  struct rgba components;
}colorval32;

NB: you could also achieve the same thing with bit-masking and shifting, i.e.

#define GETR(val) (((val) & 0xFF000000u) >> 24)

but I find the union approach more elegant.
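
For comparison, the shift-and-mask alternative mentioned above can be sketched as follows. This is only an illustration: the helper names are made up, and the 0xRRGGBBAA channel layout is an assumption chosen to match the GETR macro. Because it works on values rather than on memory layout, the result does not depend on struct padding or byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not from the answer). Assumed layout: 0xRRGGBBAA. */
static uint32_t rgba_pack(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* channel: 0 = red, 1 = green, 2 = blue, 3 = alpha */
static uint8_t rgba_channel(uint32_t val, unsigned channel)
{
    assert(channel < 4);
    /* Red sits in the top byte, so channel 0 shifts by 24, channel 3 by 0. */
    return (uint8_t)(val >> (24 - 8 * channel));
}
```

Calling `rgba_channel(rgba_pack(r, g, b, a), 0)` gives back `r` on any platform, which is the property the union version cannot promise.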

answered Dec 23 '09 at 08:34 by zebrabox · edited by chirag

But `struct rgba` may have padding, and even without padding, `sizeof(unsigned int)` may or may not be equal to `sizeof(struct rgba)` (4 for no padding). I.e., you probably don't want to do this. – Alok Singhal Dec 23 '09 at 15:52
@Alok: Not to mention that the `struct` will have `r`, `g`, `b`, and `a` in order (with or without padding), while the `unsigned int` byte order will depend on endianness. In other words, this code is not portable and may fail in odd ways when compiling on another processor. – David Thornley Dec 23 '09 at 18:44
Do note that (standard) C does not support this - assigning to one member in a union and then reading from another member is undefined behavior. Compilers/systems implementations do commonly support this though in their own specific ways. – nos Dec 23 '09 at 20:13
Thanks for the comments all. As you say this isn't portable ( nor is it intended to be) and relies on many assumptions which I didn't make clear in the post. Typically I wouldn't be using unsigned int but something like a typedef uint32_t i.e where I knew it was 32-bits. I'll amend my post to make this clearer – zebrabox Dec 24 '09 at 14:00
In addition to the portability issues, it's also a great way to confuse the compiler with potential aliasing, forcing it to skip a load of otherwise useful optimizations. – jalf Jan 29 '10 at 19:09
about what nos said; I think he means this is undefined behavior because of the strict aliasing rule. this should work when you use char as your "other" union types. – v.oddou Jun 29 '14 at 08:50

score 5 · Answer 3 · answered Dec 23 '09 at 08:33

For accessing registers or I/O ports bytewise as well as bitwise, by mapping that particular port to memory. See the example below:

typedef union
{
  unsigned int a;
  struct {
    unsigned bit0 : 1,
             bit1 : 1,
             bit2 : 1,
             bit3 : 1,
             bit4 : 1,
             bit5 : 1,
             bit6 : 1,
             bit7 : 1,
             bit8 : 1,
             bit9 : 1,
             bit10 : 1,
             bit11 : 1,
             bit12 : 1,
             bit13 : 1,
             bit14 : 1,
             bit15 : 1;
  } bits;
} IOREG;

/* volatile: every access must really reach the hardware register */
#define PORTA (*(volatile IOREG *) 0x3B)
...
unsigned int i = PORTA.a;  // read the whole register
int j = PORTA.bits.bit0;   // read a single bit
...
PORTA.bits.bit0 = 1;       // write a single bit
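
A mask-based variant of the same idea, as a rough sketch. The helper names and the 16-bit register width are assumptions for illustration; on real hardware `reg` would point at the memory-mapped port. With explicit masks, bit n is always the bit with value `1 << n`, however the compiler happens to order bit-fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; the 16-bit register width is an assumption. */
static void reg_set_bit(volatile uint16_t *reg, unsigned bit)
{
    assert(bit < 16);
    *reg |= (uint16_t)(1u << bit);
}

static void reg_clear_bit(volatile uint16_t *reg, unsigned bit)
{
    assert(bit < 16);
    *reg &= (uint16_t)~(1u << bit);
}

/* Returns 1 if the bit is set, 0 otherwise. */
static int reg_test_bit(const volatile uint16_t *reg, unsigned bit)
{
    assert(bit < 16);
    return (int)((*reg >> bit) & 1u);
}
```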

answered by wrapperm

Never use this if you have to handle big/little endian conversions – mouviciel Dec 23 '09 at 09:02
@mouviciel: This is not affected by endianness... can you please support your comment, if so..? – wrapperm Dec 23 '09 at 11:13
From the C standard: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified." – Alok Singhal Dec 23 '09 at 15:56
However, in many cases registers or I/O ports *are* platform-specific, so platform-independent code isn't such a concern. Though I suppose you might write a driver for some hardware that is used on both big-endian and little-endian machines (e.g. a Linux driver for a graphics chip)... in that case you'd know what you're doing I'm sure. – Craig McQueen Dec 24 '09 at 00:04
The alignment of the addressable storage unit can be easily determined by a simple program... actually speaking you should have an idea about the architectural endianness and stuff like that before coding for a particular platform... – wrapperm Dec 24 '09 at 07:31
This misses a disclaimer that says "Reading from inactive union members yields undefined behavior". – Sebastian Mach Jul 08 '11 at 08:44

Nate Kohl · Answer 4 · 2009-12-24T13:41:37.550

In the Windows world, unions are commonly used to implement tagged variants, which are (or were, before .NET?) one standard way of passing data between COM objects.

The idea is that a union type can provide a single natural interface for passing arbitrary data between two objects. Some COM object could pass you a variant (e.g. type VARIANT or _variant_t) which could contain either a double, float, int, or whatever.

If you have to deal with COM objects in Windows C++ code, you'll see variant types all over the place.

score 2 · Answer 5 · answered Dec 23 '09 at 08:17

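#include <stdbool.h>

struct cat_info
{
    int legs;
    int tailLen;
};

struct fish_info
{
    bool hasSpikes;
};

/* Only one member is meaningful at a time; animal_type says which. */
union animal_data
{
    struct fish_info fish;
    struct cat_info cat;
};

struct animal
{
    char* name;
    int animal_type;
    union animal_data data;
};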

answered by ironic

I know how to use a union, and my friend does too, but I want to know where it is used most commonly – yesraaj Dec 23 '09 at 08:27

score 2 · Answer 6 · answered Dec 23 '09 at 08:58

Unions are useful if you have different kinds of messages, because the intermediate levels don't have to know the exact type. Only the sender and receiver need to parse the actual message. Any other levels only really need to know the size and possibly sender and/or receiver info.
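
A minimal sketch of that layering, under assumptions: the message kinds, field names, and the `make_ping`/`relay` helpers are all invented for illustration. The relay function stands in for an intermediate level. It reads only the header and copies the payload as opaque bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum msg_kind { MSG_PING, MSG_TEXT };   /* hypothetical message kinds */

struct msg_header {
    uint16_t kind;      /* one of enum msg_kind */
    uint16_t size;      /* number of payload bytes in use */
    uint32_t sender;
};

struct msg {
    struct msg_header hdr;
    union {                              /* only sender and receiver look in here */
        struct { uint32_t seq; } ping;
        struct { char text[32]; } text;
    } body;
};

/* Sender side: fills in both the header and the matching union member. */
struct msg make_ping(uint32_t sender, uint32_t seq)
{
    struct msg m;
    memset(&m, 0, sizeof m);
    m.hdr.kind = MSG_PING;
    m.hdr.size = (uint16_t)sizeof m.body.ping;
    m.hdr.sender = sender;
    m.body.ping.seq = seq;
    return m;
}

/* Intermediate level: forwards header plus payload without decoding the body.
   Returns the number of bytes written, or 0 if the message does not fit. */
size_t relay(const struct msg *in, unsigned char *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    size_t total = offsetof(struct msg, body) + in->hdr.size;
    if (total > sizeof *in || total > cap)
        return 0;
    memcpy(out, in, total);
    return total;
}
```

The receiver would copy the bytes back into a `struct msg`, switch on `hdr.kind`, and only then touch `body.ping` or `body.text`.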

answered by Makis

Isn't this what polymorphism is for? A base class for the common message content, then each message is a specialization of the base class. – Thomas Matthews Dec 23 '09 at 22:30
@Thomas: you don't have classes in C. – Adriano Varoli Piazza Dec 24 '09 at 13:51

score 2 · Answer 7 · answered Jan 29 '10 at 18:25

SDL uses a union for representing events: http://www.libsdl.org/cgi/docwiki.cgi/SDL_Event.
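
The general shape of such an event union, as a simplified sketch. These are not SDL's actual definitions; the type and field names below are invented for illustration. Every event struct starts with the same `type` field, so code can look at the tag before deciding which member to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EV_KEY = 1, EV_MOTION = 2 };     /* hypothetical event tags */

/* All event structs share the same first member. */
struct any_event    { uint8_t type; };
struct key_event    { uint8_t type; uint8_t pressed; uint16_t keycode; };
struct motion_event { uint8_t type; int16_t x; int16_t y; };

union event {
    struct any_event    any;
    struct key_event    key;
    struct motion_event motion;
};

/* Returns the key code for key events, or -1 for every other event. */
int event_keycode(const union event *ev)
{
    assert(ev != NULL);
    if (ev->any.type != EV_KEY)          /* inspect the shared tag first */
        return -1;
    return ev->key.keycode;
}
```

An event loop can then take a single `union event` from a queue and dispatch on `any.type`, whatever kind of event arrived.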

answered by Bastien Léonard

score 1 · Answer 8 · 2009-12-23T18:01:03.437

Do you mean something like this?

union {
   long long a;
   unsigned char b[sizeof(long long)];
} long_long_to_single_bytes;

ADDED:

I have recently used this on our AIX machine to transform the 64-bit machine identifier into a byte array.

std::string getHardwareUUID(void) {
#ifdef AIX
   struct xutsname m; // AIX-specific struct to hold the 64-bit machine id
   unamex(&m);        // AIX-specific call to get the 64-bit machine id
   long_long_to_single_bytes.a = m.longnid;
   return convertToHexString(long_long_to_single_bytes.b, sizeof(long long));
#else // Windows or Linux or Solaris or ...
   ... get a 6byte ethernet MAC address somehow and put it into mac_buf
   return convertToHexString(mac_buf, 6);
#endif
}

edited Dec 23 '09 at 18:01

answered Dec 23 '09 at 08:00

Yes, but I want to know where it is practically applied, something like max_align – yesraaj Dec 23 '09 at 08:23
If you use something like this, you should take care of the byte order (endianness). This means the code may work on one machine (big endian) and not on another (little endian) – Yousf Dec 23 '09 at 14:03
It works on BE and on LE machines. Only the results are different :-) – Dec 23 '09 at 17:30
@yesraaj: I have recently used this on our AIX machine to transform the 64-bit machine identifier into a byte array. Note: this machine identifier is something AIX-specific. – Dec 23 '09 at 17:34

score 1 · Answer 9 · edited Dec 23 '09 at 23:59

I've sometimes used unions this way:

//Define type of structure
typedef enum { ANALOG, BOOLEAN, UNKNOWN } typeValue_t;
//Define the union
typedef struct  {
  typeValue_t typeValue;
  /*On this structure you will access the correct type of
    data according to its type*/
  union {
    float ParamAnalog;
    char  ParamBool;
  };
} Value_t;

Then you could declare arrays holding different kinds of values, storing the data more or less efficiently, and perform some "polymorphic" operations like:

 void printValue ( Value_t value ) {
    switch (value.typeValue) {
       case BOOLEAN:
          printf("Boolean: %c\n", value.ParamBool?'T':'F');
          break;
       case ANALOG:
          printf("Analog: %f\n", value.ParamAnalog);
          break;
       case UNKNOWN:
          printf("Error, value UNKNOWN\n");
          break;
    }
 }

score 1 · Answer 10 · 2009-12-23T19:30:34.970

Here is another example where a union could be useful.

(not my own idea, I have found this on a document discussing c++ optimizations)

begin-quote

.... Unions can also be used to save space, e.g.

first the non-union approach:

void F3(bool useInt) {
    if (useInt) {
        int a[1000];
        F1(a);  // call a function which expects an array of int as parameter
    }
    else {
        float b[1000];
        F2(b);  // call a function which expects an array of float as parameter
    }
}

Here it is possible to use the same memory area for a and b because their live ranges do not overlap. You can save a lot of cpu-cache space by joining a and b in a union:

void F3(bool useInt) {

    union {
        int a[1000];
        float b[1000];
    };

    if (useInt) {
        F1(a);  // call a function which expects an array of int as parameter
    }
    else {
        F2(b);  // call a function which expects an array of float as parameter
    }
}

Using a union is not a safe programming practice, of course, because you will get no warning from the compiler if the uses of a and b overlap. You should use this method only for big objects that take a lot of cache space. ...

end-quote

score 1 · Answer 11 · answered Dec 24 '09 at 17:19

When reading serialized data that needs to be coerced into specific types.
When returning semantic values from lex to yacc. (yylval)
When implementing a polymorphic type, especially one that reads a DSL or general language
When implementing a dispatcher that specifically calls functions intended to take different types.

Afriza N. Arief · Answer 12 · 2010-01-29T18:45:15.010

Recently I think I saw unions used in vector programming. Vector programming is used in Intel MMX technology, GPU hardware, IBM's Cell Broadband Engine, and others.

A vector may correspond to a 128-bit register, which is very common in SIMD architectures. Since the hardware has 128-bit registers, you can store four single-precision floating-point values in one register/variable. An easy way to construct, convert, and extract individual elements of a vector is to use a union.

typedef union {
    vector4f vec; // processor-specific built-in type
    struct { // human-friendly access for transformations, etc
        float x;
        float y;
        float z;
        float w;
    };
    struct { // human-friendly access for color processing, lighting, etc
        float r;
        float g;
        float b;
        float a;
    };
    float arr[4]; // yet another convenience access
} Vector4f;

int main()
{
    Vector4f position, normal, color;
    // human-friendly access
    position.x = 12.3f;
    position.y = 2.f;
    position.z = 3.f;
    position.w = 1.f;

    // computer friendly access
    //some_processor_specific_operation(position.vec,normal.vec,color.vec);
    return 0;
}

If you go down the path of PlayStation 3 multi-core programming, or graphics programming, there's a good chance you'll come across more of this.

Do note that writing to one member of a union then reading from another leads to undefined behavior. — GManNickG, Jan 29 '10 at 18:53

score 1 · Answer 13 · answered Nov 25 '10 at 13:04

I know I'm a bit late to the party, but as a practical example the Variant datatype in VBScript is, I believe, implemented as a Union. The following code is a simplified example taken from an article otherwise found here

struct tagVARIANT
{
    struct  /* the vt tag and reserved words sit beside the payload union, not on top of it */
    {
        VARTYPE vt;
        WORD wReserved1;
        WORD wReserved2;
        WORD wReserved3;
        union 
        {
            LONG lVal;
            BYTE bVal;
            SHORT iVal;
            FLOAT fltVal;
            DOUBLE dblVal;
            VARIANT_BOOL boolVal;
            DATE date;
            BSTR bstrVal;
            SAFEARRAY *parray;
            VARIANT *pvarVal;
        };
    };
};

The actual implementation (as the article states) is found in the oaidl.h C header file.

score 0 · Answer 14 · answered Dec 23 '09 at 08:00

Example:

When using different socket types, but you want a common type to refer to them.
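
One common form of this is a union over the socket address structures, sketched below for POSIX systems. The union and function names are hypothetical; the `sockaddr` structures, `AF_INET`/`AF_INET6`, and `ntohs` are the standard POSIX ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical wrapper type: one object that can hold any of these addresses. */
union any_sockaddr {
    struct sockaddr         sa;       /* generic view, used to read the family */
    struct sockaddr_in      in4;      /* IPv4 */
    struct sockaddr_in6     in6;      /* IPv6 */
    struct sockaddr_storage storage;  /* guarantees enough room and alignment */
};

/* Returns the port in host byte order, or -1 for an unknown address family. */
int any_sockaddr_port(const union any_sockaddr *addr)
{
    assert(addr != NULL);
    switch (addr->sa.sa_family) {
    case AF_INET:  return ntohs(addr->in4.sin_port);
    case AF_INET6: return ntohs(addr->in6.sin6_port);
    default:       return -1;
    }
}
```

Code that accepts connections can pass `&addr.sa` to calls expecting a `struct sockaddr *` and then read the family-specific member once the family is known.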

answered by Xolve

score 0 · Answer 15 · answered Dec 23 '09 at 08:30

One more example: to avoid casts.

typedef union {
  long int_v;
  float float_v;
} int_float;

void foo(float v) {
  int_float i;
  i.float_v = v;
  /* IEEE-754 single precision: 1 sign bit, 8 exponent bits (bias 127), 23 fraction bits */
  printf("sign=%d exp=%d fraction=%d", (int)((i.int_v>>31)&1), (int)((i.int_v>>23)&0xff)-127, (int)(i.int_v&((1L<<23)-1)));
}

instead of:

void foo(float v) {
  long i = *((long*)&v);
  printf("sign=%d exp=%d fraction=%d", (int)((i>>31)&1), (int)((i>>23)&0xff)-127, (int)(i&((1L<<23)-1)));
}

answered by fortran

using a union to cast like this is technically undefined behaviour (although i'm unaware of any implementations which don't actually do what you want here) – jk. Dec 23 '09 at 09:01
A common term for this technique is "type punning" – fbrereto Dec 23 '09 at 20:02
@jk The very same undefined behaviour of getting the address, casting to a int pointer and then getting the contents. The only thing undefined here should be the endianness of the float, that is not specified by any standard. – fortran Dec 29 '09 at 10:03
This technique blows up when your longs are 64 bit and your floats are not. e.g. linux 64 bit. – nos Jan 29 '10 at 19:25
@nos it won't "blow up", you'll keep getting the binary value of the float in either the lower or the upper part of the int (depending on the endianness) and in the other, maybe trash... you can use a fixed size type if you wish, but this was just an example, not production quality code. – fortran Jan 30 '10 at 08:55
I consider my applications to "blow up" when they start producing or processing garbage :-) – nos Jan 30 '10 at 11:02

score 0 · Answer 16 · answered Dec 23 '09 at 08:40

For convenience, I use unions to let me use the same class to store xyzw and rgba values

#ifndef VERTEX4DH
    #define VERTEX4DH

    struct Vertex4d{

        union {
            double x;
            double r;
        };
        union {
            double y;
            double g;
        };
        union {
            double z;
            double b;
        };
        union {
            double w;
            double a;
        };

        Vertex4d(double x=0, double y=0,double z=0,double w=0) : x(x), y(y),z(z),w(w){}
    };

#endif

score 0 · Answer 17 · answered Dec 23 '09 at 09:26

Many examples of unions can be found in <X11/Xlib.h>. A few others are in some IP stacks (in BSD <netinet/ip.h>, for instance).

As a general rule, protocol implementations use the union construct.

answered by mouviciel

score 0 · Answer 18 · answered Dec 23 '09 at 19:59

Unions can also be useful when type punning, which is desirable in a select few places (such as some techniques for floating-point comparison algorithms).
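
A minimal sketch of such punning, assuming `float` is a 32-bit IEEE-754 value (the type and function names are made up). The `memcpy` version is shown alongside because it yields the same bits without reading a union member other than the one last written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size (checked at run time below). */
union float_bits {
    float    f;
    uint32_t u;
};

/* Reads the float's bit pattern through the other union member. */
uint32_t bits_via_union(float f)
{
    union float_bits fb;
    assert(sizeof(float) == sizeof(uint32_t));
    fb.f = f;
    return fb.u;
}

/* Same result by copying the object representation byte for byte. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Floating-point comparison tricks of the kind mentioned above typically start from such an integer view of the bits.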

answered by fbrereto

Though that's using undefined behaviour, even if widely supported. – Sebastian Mach Jul 08 '11 at 08:45
