Can anyone please explain what type punning is in C and demonstrate, with a simple example program, when such problems occur?

I have searched many websites (even Wikipedia), but I still couldn't get a clear idea.

Embedded_User

1 Answer

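Type Punning is a broad concept that applies to just about any language with a type system and a little bit of flexibility. In C it means accessing an object's storage through a type other than the one it was declared with, bypassing the normal conversions. I'd use Wikipedia's example with Berkeley sockets:

From the Wikipedia page: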

One classic example of type punning is found in the Berkeley sockets interface. The function to bind an opened but uninitialized socket to an IP address is declared as follows:

int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);

The bind function is usually called as follows:

struct sockaddr_in sa = {0};
int sockfd = ...;
sa.sin_family = AF_INET;
sa.sin_port = htons(port);
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

The Berkeley sockets library fundamentally relies on the fact that in C, a pointer to struct sockaddr_in is freely convertible to a pointer to struct sockaddr; and, in addition, that the two structure types share the same memory layout. Therefore, a reference to the structure field my_addr->sin_family (where my_addr is of type struct sockaddr*) will actually refer to the field sa.sin_family (where sa is of type struct sockaddr_in). In other words, the sockets library uses type punning to implement a rudimentary form of inheritance.

Also common in the programming world are data structures that store different kinds of values in what is effectively the same storage space. This is typically done when two structures are mutually exclusive (only one is in use at any given time), so overlapping them saves memory.

Edit: I didn't notice the edit where you mentioned trying Wikipedia. I think what you should take away in that case is what I say in the first sentence, namely "Type Punning is a broad concept that applies to just about any language with a type system and a little bit of flexibility". If you're having trouble with it, I'd suggest looking for more examples and implementations of the strategy (maybe look at OOP in C to pick up some of the concepts involved, although it's not exactly the same thing).

Another Edit: It occurred to me maybe you meant type punning in the context of unions, so here's a modified example from this question that asked the purpose of unions (working backwards here):

union RGB
{
    uint32_t color;

    struct componentsTag
    {
        uint8_t b;
        uint8_t g;
        uint8_t r;
    } components;

} pixel;

pixel.color = 0x020406;
uint8_t rVal = pixel.components.r; // 0x02 on a little-endian machine
uint8_t gVal = pixel.components.g; // 0x04 on a little-endian machine
uint8_t bVal = pixel.components.b; // 0x06 on a little-endian machine

Run this example online here: https://onlinegdb.com/H1Sfm6p8E

Here type punning is being used to access the individual value of each color component without going through C's type conversions. You might wonder how this works. In memory the union takes up 32 bits. When color is set with the line pixel.color = 0x020406, these 32 bits are filled with the value 0x00020406: each pair of hex digits after the 0x takes up 8 bits (8*4 = 32 bits). On a little-endian machine (such as x86) the bytes are stored starting from the right-hand end of the hex number, so 06 goes into the first (least significant) byte, which is b, 04 goes into the 2nd byte, which is g, and 02 goes into the 3rd byte, which is r. The most significant byte, 00, is not covered by any member of the struct, because the struct is only 3 bytes long. That byte could hold alpha if you added uint8_t alpha to the struct as a new line just below uint8_t r. On a big-endian machine the byte order is reversed, so b would read 00, g would read 02, and r would read 04.

A diagram splitting these 32 bits into parts might look like this:

------------------------------------
-----| 00  | 02  | 04  | 06  |------
------------------------------------
     |    uint32_t color     |
------------------------------------

But the 24-bit components struct also occupies the same memory: namely, the 3 right-most bytes (the 24 least significant bits) of that space, assuming a little-endian machine and no padding between the struct members.

So now the full diagram for the union's memory is:

------------------------------------
-----| 00  | 02  | 04  | 06  |------
------------------------------------
     |    uint32_t color     |
------------------------------------
-----| NA  |  r  |  g  |  b  |------
------------------------------------

Note how r, g, and b all overlap color. Accessing r, g, or b now accesses a specific 8-bit part of color. An ordinary conversion from uint32_t to uint8_t would simply give you the 8 least significant bits of the uint32_t (0x06 here), so it could never yield the g or r byte. But as I said before, here the union is being used for type punning, so the conversion the standard defines is being circumvented.

(major edits & corrections by Gabriel Staples, 6 Mar. 2019)

Gabriel Staples
Selali Adobor
  • I don't understand how 8*3 = 32 bits ? – Michael Heidelberg Dec 03 '15 at 12:05
  • I was using a game programming example and forgot the alpha value, which makes for 32 – Selali Adobor Dec 03 '15 at 15:26
  • When I get a chance I'll add alpha, I was about to change it to 24, but uint24_t might just add a little bit of confusion since it doesn't exist in the standards – Selali Adobor Dec 03 '15 at 22:39
  • Is the part of the `union` defined behaviour? Because the `struct` could have padding, right? I've never really understood this completely. – alx - recommends codidact Jan 30 '19 at 17:47
  • @SelaliAdobor, I think your answer is really a valuable addition to the community and has a really great example, but had some parts that were confusing or fundamentally wrong, so I want to let you know I just made some major edits to your answer to clean it up and make it all correct. I hope that's ok with you. I think this is now truly a great answer. If you like this, awesome. If not, I will move my edits into my own answer, but I don't think that is necessary. – Gabriel Staples Mar 06 '19 at 22:28