Why is a cast of uintptr_t used here?

Question

Here it is, in the context of pthread.h and stdint.h:

struct arguments {
    uint32_t threads;
    uint32_t size;
};

void *run_v1(void *arg) {
    uint32_t thread = (uintptr_t) arg;
    for (uint32_t j = 0; j < arguments.size; ++j) {
        size_t global_index = get_global_index(thread, j);
        char *string = get_string(global_index);
        hash_table_v1_add_entry(hash_table_v1, string, global_index);
    }
    return NULL;
}

...
int main () {
    ...
    for (uintptr_t i = 0; i < arguments.threads; ++i) {
        int err = pthread_create(&threads[i], NULL, run_v1, (void*) i);
        if (err != 0) {
            printf("pthread_create returned %d\n", err);
            return err;
        }
    }
   ...
}

This is my professor's code, I read the specification here:

The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to a pointer to void, and the result will compare equal to the original pointer: uintptr_t

Why is the cast to this type necessary, rather than passing in a uint32_t and casting to itself? Is this undefined behavior?

Because you can't implicitly convert a pointer to a non-pointer type. You actually need *two* casts to be fully correct: `uint32_t thread = (uint32_t) (uintptr_t) arg;`. Then on the other end you need the `pthread_create` call to do the opposite two casts: `pthread_create(..., (void *) (uintptr_t) some_uint32_t_value)` — Some programmer dude, May 06 '23 at 20:05
I've seen some professors write code like the above but instead it's `int val = (int) arg;`. You're saying this is incorrect? @Someprogrammerdude — user129393192, May 06 '23 at 20:28
And what difference does it exactly make? I thought the whole point of C is that it's all just bit representations you can do what you like with. You stated 'implicitly', but aren't these casts **explicit**? — user129393192, May 06 '23 at 20:29
There is not sufficient code or context to say why this cast is used. It could be that only the low 32 bits of an address are needed. It could be that some API that takes an address was being used but the author wanted to pass a 32-bit value, so they converted the 32-bit value to `void *` and passed that, and this code is converting it back, using the intermediate cast to `uintptr_t` to avoid a compiler warning. (In this case, with threads, the proper method is to pass a pointer to a `uint32_t`, not to smuggle an integer value through a `void *`.) The question ought to be updated with context. — Eric Postpischil, May 06 '23 at 20:37
I doubt you find a compiler or environment where the casts shown in your code won't work. But it's not correct if one should pedantically follow the C standard. — Some programmer dude, May 06 '23 at 20:45
By the way, did your teacher also write the `for (uintptr_t i = 0; ...)` loop? Then the cast you wonder about (inside the thread function) makes a *little* more sense, but the result should still be cast since you're not using a `uintptr_t` type for your variable in the function. With that said, using `uintptr_t` for the loop variable? That's just bad coding. Either use `size_t` or plain `unsigned`. — Some programmer dude, May 06 '23 at 20:48
Yea that's sort of what I was wondering about @Someprogrammerdude . I'm wondering what his intentions were. This is actually code from a former professor whose projects we use, not my current. I was just reading and analyzing and this came up and I got curious — user129393192, May 06 '23 at 21:33
@user129393192, Code should not cast nor use `uint32_t` here. Code has various weaknesses that a [mcve] would illuminate. Without a [mcve], the "Why is the cast to this type necessary" rationale is, at best, incomplete. — chux - Reinstate Monica, May 07 '23 at 03:45

score 1 · Answer 1 · answered May 06 '23 at 20:07

The cast guarantees that the pointer be converted to an unsigned integer type in a reversible way.

In case this type is larger than a uint32_t, a compiler warning will be raised.

A direct cast to uint32_t might hide the fact that the conversion is not reversible. (Though, honestly, I cannot think of an implementation causing that.)
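
As a minimal sketch of that idea, the two conversions can be wrapped in a pair of helpers. The names `index_to_arg` and `arg_to_index` are hypothetical and not part of the professor's code, and the round trip relies on implementation-defined integer/pointer conversions that mainstream compilers happen to support.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen the integer to uintptr_t first, then
 * convert it to a pointer, so no cast changes size and kind at once. */
static void *index_to_arg(uint32_t index) {
    return (void *)(uintptr_t)index;
}

/* Hypothetical helper: recover the integer. The explicit narrowing cast
 * documents that truncation is intended; the assert catches any value
 * that would not fit in a uint32_t. */
static uint32_t arg_to_index(void *arg) {
    uintptr_t raw = (uintptr_t)arg;
    assert(raw <= UINT32_MAX);
    return (uint32_t)raw;
}
```

With these, the thread function would read its index as `arg_to_index(arg)` and the creating loop would pass `index_to_arg(i)`.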

Yves Daoust

"In case this type is larger than a uint32_t, a compiler warning will be raised." is a very good point. – chux - Reinstate Monica May 07 '23 at 03:34
"I cannot think of a implementation causing that." --> wait a few years. [Moore's law](https://en.wikipedia.org/wiki/Moore%27s_law) will eventually break OP's/professor's short-sighted code. Code should be `uintptr_t thread = ...` – chux - Reinstate Monica May 07 '23 at 03:40
@chux-ReinstateMonica: I don't get your second remark. This code is already "broken" on PCs, which are all 64-bit now. But there is little reason to expect address spaces to go above 64 bits any time soon (this exceeds the total memory of the largest existing supercomputer). – Yves Daoust May 07 '23 at 06:02
"don't get your second remark." & "exceeds the total memory of the largest existing supercomputer)" --> I remember the first computer I used that had 4 GB of memory, breaking the 32-bit address limit. Exceeding [64-bit memory](https://www.researchgate.net/figure/Storage-capacity-trend-of-emerging-nonvolatile-memories_fig15_280222690) might take 30 years, but we'll get there. – chux - Reinstate Monica May 07 '23 at 15:15
@chux-ReinstateMonica: C will be dead by then, and Moore's law obsolete. – Yves Daoust May 07 '23 at 15:27
Yves Daoust, Thought that 30 years ago too - yet here we are. Perhaps you will be right. Maybe your prior experience has insight not readily seen. – chux - Reinstate Monica May 07 '23 at 18:10

score 1 · Answer 2 · answered May 08 '23 at 20:00

Why is the cast to this type necessary,

It is not.

rather than passing in a uint32_t and casting to itself?

That's what I would do.

Is this undefined behavior?

Maybe. But it definitely relies on implementation-defined behavior.

There are both general principles and problem-specific ones to consider here.

The most relevant general principle is the definition of uintptr_t, which you quoted. It tells you that uintptr_t can represent a distinct value corresponding to each distinct, valid void * value, so you can be confident that converting a void * to type uintptr_t will not produce a loss of fidelity. In general, then, if you want to represent an object pointer as an integer, uintptr_t is the integer type to choose.

It is relatively common to conclude that uintptr_t must be the same size as a void *, but although that's often true, the language spec places no such requirement. Since uintptr_t needs only to provide distinct representations for valid pointer values, and also because distinct void * bit patterns don't have to represent distinct addresses, uintptr_t could conceivably be smaller than void *. On the other hand, it can fulfill its role just fine if it is larger than void *.

Moreover, the language spec requires that you can round-trip pointers through type uintptr_t, but it does not require that you can round-trip any variety of integer through a pointer. The results of most integer-to-pointer conversions are implementation defined. That is, given this ...

    uintptr_t x;
    // .. assign a value to x ...

... the language spec allows this to print "unequal":

    if (x == (uintptr_t)(void *) x) {
        puts("equal");
    } else {
        puts("unequal");
    }

But in this specific case,

the upper bound on the values to be conveyed is read from an object of type uint32_t, and therefore all values to be conveyed are representable by that type; and
the program is assuming a C implementation in which the integer --> pointer --> integer transit reproduces the original value for all the integer values to be conveyed.

Under these circumstances, language semantics present no reason to prefer uintptr_t over uint32_t as the integer type involved. That is, if the code presented works correctly, then a version in which uintptr_t is replaced with uint32_t must also work correctly. And I find the latter alternative cleaner and clearer.
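
The two assumptions listed above can be made explicit in code. This is only a sketch, assuming a C11 compiler for `_Static_assert`; the helper name `survives_round_trip` is hypothetical. It checks what a given implementation does and guarantees nothing about others.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compile-time check: every uint32_t value must be representable in
 * uintptr_t, otherwise widening before the pointer cast loses bits. */
_Static_assert(UINTPTR_MAX >= UINT32_MAX,
               "uintptr_t must be able to hold every uint32_t value");

/* Hypothetical helper: reports whether this implementation reproduces
 * `value` after an integer -> pointer -> integer transit. The language
 * spec leaves the result implementation-defined. */
static bool survives_round_trip(uint32_t value) {
    void *p = (void *)(uintptr_t)value;
    return (uintptr_t)p == (uintptr_t)value;
}
```

A program that smuggles thread indices through `void *` could call this once per index at startup, for example inside an `assert`, to record the assumption it depends on.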

In what circumstance would the example you presented print unequal? Is that if the `uintptr_t` can hold values that are not representable as a `void*`? As for the latter, are you saying you would simply pass in `(..., (void*) x)` where `x` is of type `uint32_t` and then recast that as `uint32_t val = (uint32_t) void_ptr_param`? — user129393192, May 08 '23 at 20:22
@user129393192, again, integer-to-pointer conversion is *implementation defined*, so the sky's the limit. But one not-too-implausible way would involve a system that uses a segment:offset addressing scheme with overlapping segments. Integer-to-pointer conversion expands the integer to some fixed width, then interprets half the integer as the segment, the other half as the offset. On conversion back to integer, it produces a normalized representation with the offset less than the segment size. — John Bollinger, May 08 '23 at 20:32
I see. I don’t quite understand your example, but I get your point. As for the assumption that integers can be represented in pointers, does that mean that the safest, most by-the-standard way is to pass a pointer to an integer? I imagine this would have race conditions. — user129393192, May 08 '23 at 23:36
@user129393192, yes, the safest way is to pass a pointer to an integer, which the thread then dereferences to get the wanted value. Yes, if done wrong, it can produce data races. It's often best to pass a different pointer to each thread -- for example, each pointing to a different element of an array. If that's not viable, then it's also possible to use semaphores or a CV + mutex to ensure that each new thread reads the value before it is updated for the next thread. — John Bollinger, May 08 '23 at 23:49
Got it, so based on what you have said, that is the only standard compliant way, and the "smuggling" of values through pointers can lead to issues -- though not likely on most machines and implementations. — user129393192, May 09 '23 at 00:14
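
The CV + mutex variant mentioned in the comments could look roughly like the sketch below. All names (`lock`, `taken`, `consumed`, `seen`, `worker`, `spawn_all`, `MAX_THREADS`) are hypothetical. The creating thread reuses a single shared integer and waits until each new thread has copied it before overwriting it for the next one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_THREADS 4

static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  taken = PTHREAD_COND_INITIALIZER;
static bool     consumed;            /* true once a new thread has copied the id */
static uint32_t seen[MAX_THREADS];   /* each thread records id + 1 here */

static void *worker(void *arg) {
    const uint32_t *shared = arg;
    pthread_mutex_lock(&lock);
    uint32_t id = *shared;           /* copy before the creator reuses the slot */
    consumed = true;
    pthread_cond_signal(&taken);
    pthread_mutex_unlock(&lock);
    seen[id] = id + 1;               /* stand-in for the thread's real work */
    return NULL;
}

/* Starts `count` threads; returns 0 or the first pthread_create error. */
static int spawn_all(pthread_t *threads, uint32_t count) {
    static uint32_t next_id;         /* static so it outlives this call */
    assert(count <= MAX_THREADS);
    for (uint32_t i = 0; i < count; ++i) {
        pthread_mutex_lock(&lock);
        next_id = i;
        consumed = false;
        int err = pthread_create(&threads[i], NULL, worker, &next_id);
        if (err != 0) {
            pthread_mutex_unlock(&lock);
            return err;
        }
        while (!consumed) {          /* loop guards against spurious wakeups */
            pthread_cond_wait(&taken, &lock);
        }
        pthread_mutex_unlock(&lock);
    }
    return 0;
}
```

The hand-off serializes thread start-up, which is why giving each thread its own array element is usually the simpler choice when it is viable.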

Ted Lyngmo · Answer 3 · 2023-05-07T07:06:22.267

Why is the cast to this type necessary, rather than a cast to uint32_t?

It is not, but if uint32_t is smaller than uintptr_t you may get a warning about "cast from pointer to integer of different size". uintptr_t on the other hand is defined to be able to store pointer values as integers.

When the cast to uintptr_t is done you may still get a warning about "implicit conversion loses integer precision", so if what you've stored in the void* was actually a uint32_t to start with, add a cast to avoid that potential warning:

uint32_t thread = (uint32_t)(uintptr_t)arg;

However, I suggest sending in the value via a pointer and then you wouldn't need any explicit casts:

void *run(void *arg) {
    uint32_t *ptr = arg;
    uint32_t thread = *ptr;
    ...
    return NULL;
}

uint32_t value;
pthread_create(..., &value);

A more elaborate example of making use of passing a pointer which is easy to extend if your thread needs more data than a single uint32_t could look like this:

#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

#define SIZE(x) (sizeof (x) / sizeof *(x))

typedef struct {
    uint32_t value;
    // other data that the thread needs can be added here
} thread_data;

void *run(void *arg) {
    thread_data *data = arg;
    // work with data here
    return NULL;
}

int main() {
    pthread_t th[10];
    thread_data data[SIZE(th)];
    
    for (size_t i = 0; i < SIZE(th); ++i) {
        // fill data[i] with values to work with
        pthread_create(&th[i], NULL, run, &data[i]);
    }

    // ... join etc ...
}
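
For completeness, here is one way the elided join step might be handled, written as a separate sketch rather than a continuation of the code above. The names `thread_slot`, `square`, `run_all` and `THREAD_COUNT` are hypothetical, and the squaring is only a placeholder for real work. Each thread gets a pointer to its own array element, so no thread shares data with another.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define THREAD_COUNT 4

typedef struct {
    uint32_t value;   /* input: this thread's index */
    uint32_t result;  /* output: written by the thread, read after join */
} thread_slot;

static void *square(void *arg) {
    thread_slot *slot = arg;          /* void * converts implicitly, no cast */
    slot->result = slot->value * slot->value;
    return NULL;
}

/* Starts one thread per slot and joins every thread that was started.
 * Returns 0 on success, otherwise the first pthread error code seen. */
static int run_all(thread_slot slots[THREAD_COUNT]) {
    pthread_t th[THREAD_COUNT];
    size_t started = 0;
    int err = 0;

    assert(slots != NULL);
    for (; started < THREAD_COUNT; ++started) {
        slots[started].value = (uint32_t)started;
        err = pthread_create(&th[started], NULL, square, &slots[started]);
        if (err != 0) {
            break;                    /* still join the threads already running */
        }
    }
    for (size_t i = 0; i < started; ++i) {
        int join_err = pthread_join(th[i], NULL);
        if (err == 0) {
            err = join_err;
        }
    }
    return err;
}
```

Because the slots belong to the caller and are read only after `pthread_join` returns, no additional locking is needed.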

This of course assumes that the `pthread_create` call was passed a *pointer* to a `uint32_t` value. Which is not needed. And can lead to data-races. — Some programmer dude, May 06 '23 at 20:07
@Someprogrammerdude _"This of course assumes that the pthread_create call was passed a pointer to a `uint32_t` value"_ - Yes, that's a change I propose. Races can happen if not handled properly for sure. — Ted Lyngmo, May 06 '23 at 20:10
Why exactly is all this casting stuff necessary? I had previously thought the whole point of an explicit cast was to convert the value and stop the compiler warning, truncating the value if necessary or converting to floating point in some circumstances? — user129393192, May 06 '23 at 20:32
I also was taught the point of a `void*` is that it can be cast to anything, so this is all news to me. — user129393192, May 06 '23 at 20:32
@user129393192 In C any *pointer* is implicitly convertible to `void *`. And `void *` is implicitly convertible to any pointer. With the exception of function pointers, those aren't implicitly or explicitly convertible to or from `void *`. — Some programmer dude, May 06 '23 at 20:39
