
An external API expects a pointer to an array of values (int as simple example here) plus a size.

It is logically clearer to deal with the elements in groups of 4.

So process elements via a "group of 4" struct and then pass the array of those structs to the external API using a pointer cast. See code below.

Spider sense says: "strict aliasing violation" in the reinterpret_cast => possible UB?

  1. Are the static_asserts below enough to ensure: a) this works in practice b) this is actually standards compliant and not UB?

  2. Otherwise, what do I need to do, to make it "not UB". A union? How exactly please?

  3. Or is there a different, better way overall?


#include <cstddef>

void f(int*, std::size_t) {
    // external implementation
    // process array
}

int main() {

    static constexpr std::size_t group_size    = 4;
    static constexpr std::size_t number_groups = 10;
    static constexpr std::size_t total_number  = group_size * number_groups;

    static_assert(total_number % group_size == 0);

    int vals[total_number]{};

    struct quad {
        int val[group_size]{};
    };

    quad vals2[number_groups]{};
    // deal with values in groups of four using member functions of `quad`

    static_assert(alignof(int) == alignof(quad));
    static_assert(group_size * sizeof(int) == sizeof(quad));
    static_assert(sizeof(vals) == sizeof(vals2));

    f(vals, total_number);
    f(reinterpret_cast<int*>(vals2), total_number); /// is this UB? or OK under above asserts?
}

Oliver Schönrock
  • "*It is logically clearer to deal with the elements in groups of 4.*" But that's not what the API asked for, is it? Regardless of whether it's "logically clearer" to *you*, you should provide what the API *asked for*. – Nicol Bolas Dec 18 '22 at 19:59
  • @NicolBolas The API actually takes a third param, an enum which indicates "groups of 4".. so yeah, it is logical to me and the API – Oliver Schönrock Dec 18 '22 at 20:01
  • The API says "array of `int`"s. So that's what you provide. Other parameters can't change that. – Nicol Bolas Dec 18 '22 at 20:01
  • Casting a pointer to a single quad into a pointer to int *might* be ok, as the first member of the quad is an int. But moving from one quad to the next using that pointer is not. A pointer into one array can never move into another array. – BoP Dec 18 '22 at 20:01
  • `static_assert` or `union` doesn't remove UB afaict. you can instead apply a grouped view on top of the raw array. – apple apple Dec 18 '22 at 20:04
  • @NicolBolas That is what I am providing... via a cast. The processing inside is in groups of four. according to the enum 3rd param. The API is SFML for reference and int is actually sf::Vertex. – Oliver Schönrock Dec 18 '22 at 20:04
  • @OliverSchönrock: "*That is what I am providing... via a cast.*" A cast cannot change what you created. That's the entire point of strict-aliasing. – Nicol Bolas Dec 18 '22 at 20:06
  • @appleapple Yes, I thought not.. enough for "working in practice" but not for removing UB. The "grouped view" sounds interesting. How to? – Oliver Schönrock Dec 18 '22 at 20:06
  • @NicolBolas I understand that. That is why I am asking the question. Are you providing a language-lawyer perspective here? If so that's fine, since I asked for that, and it confirms my suspicions. However. I am finding your answers slightly unhelpful. Is there a way to achieve what I am trying to do without introducing UB? ie a way of processing the array in groups before making the API call? – Oliver Schönrock Dec 18 '22 at 20:09
  • @OliverSchönrock: My point is that it's kind of a question you shouldn't ask. It's code that only makes sense if you think of objects as observers of memory and not... *objects*. If you just do what the API tells you to do, then there's no need to ask the question. "*However. I am finding your answers slightly unhelpful.*" These aren't answers; they're comments. That's why they're in the "comment" section, not the "answer" section. – Nicol Bolas Dec 18 '22 at 20:10
  • @NicolBolas "These aren't answers; they're comments" ..yet your answer looks awfully similar – Oliver Schönrock Dec 18 '22 at 20:11
  • @OliverSchönrock: It also has specification citations, but no commentary that you shouldn't be trying it. – Nicol Bolas Dec 18 '22 at 20:12
  • Regarding the grouped view: You might want to look at `std::views::chunk`: https://stackoverflow.com/questions/66928472/c20-how-to-split-range-by-size. Unfortunately C++23, but you can use range-v3 in the meantime – joergbrech Dec 18 '22 at 20:13
  • @OliverSchönrock A grouped view can be implemented in many ways; the simplest (other than no wrapper at all) could be to just let an overloaded `operator[](int n)` return `p+4*n`, if all you need is indexing. – apple apple Dec 18 '22 at 20:20
  • @OliverSchönrock or construct an array of `std::span` if you want something more. – apple apple Dec 18 '22 at 20:23
  • @appleapple yes.. something which operates on / contains the raw int[] but constructs / presents as a std::span or something inheriting from std::span is a good idea. Thank you. – Oliver Schönrock Dec 18 '22 at 20:26
  • @joergbrech That's interesting... I still have much to learn about applications of ranges. Can I "write to the underlying range" through the "chunked view"? Does all that have to happen in the `|` composed lines? That may not work in practice? – Oliver Schönrock Dec 18 '22 at 20:35
  • @OliverSchönrock Yes, you can write to the underlying range through the chunked view. The `|` is just a convenience operator for functional composition. You don't have to use it if you don't need it: https://godbolt.org/z/cYcGMqsa7. I wouldn't know why this should not work in practice. – joergbrech Dec 18 '22 at 20:44
  • @joergbrech That's excellent! I like it and didn't know that. Seems like gcc trunk already has this part of c++23. – Oliver Schönrock Dec 18 '22 at 20:49
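
The grouped view suggested in the comments above can be sketched roughly as follows. This is only an illustration, not code from the thread: `quad_view`, `fill_and_call`, and the summing stand-in for `f` are hypothetical names, and the view supports indexing only. The storage stays a plain `int` array, so it can be handed to the API without any cast.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): a non-owning view that
// presents a flat int array as consecutive groups of 4.
class quad_view {
public:
    static constexpr std::size_t group_size = 4;

    quad_view(int* data, std::size_t count) : data_(data), count_(count) {
        assert(count % group_size == 0);  // whole groups only
    }

    std::size_t groups() const { return count_ / group_size; }

    // Pointer to the first element of group n; index it with [0, group_size).
    int* operator[](std::size_t n) const { return data_ + group_size * n; }

private:
    int*        data_;
    std::size_t count_;
};

// Stand-in for the external API (assumption: here it just sums the values).
inline int f(const int* p, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}

// Fill each group through the view, then pass the flat array itself to f.
inline int fill_and_call() {
    int vals[12]{};
    quad_view view(vals, 12);
    for (std::size_t g = 0; g < view.groups(); ++g)
        for (std::size_t i = 0; i < quad_view::group_size; ++i)
            view[g][i] = static_cast<int>(g);  // every element of group g gets g
    return f(vals, 12);
}
```

All pointer arithmetic here happens inside the one `int[12]` array, so the [expr.add] rules are satisfied; the view only changes how the elements are addressed.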

2 Answers


No amount of static_asserts is going to make something which is categorically UB into well-defined behavior in accord with the standard. You did not create an array of ints; you created a struct containing an array of ints. So that's what you have.

It's legal to convert a pointer to a quad into a pointer to an int[group_size] (though you'll need to alter your code appropriately). Or you could just access the array directly and convert that to an int*.

Regardless of how you get a pointer to the first element, it's legal to do pointer arithmetic within that array. But the moment you try to do pointer arithmetic past the boundaries of the array within that quad object, you achieve undefined behavior. Pointer arithmetic is defined based on the existence of an array: [expr.add]/4

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
  • Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
  • Otherwise, the behavior is undefined.

The pointer isn't null, so case 1 doesn't apply. The n above is group_size (because the array is the one within quad), so if the index is > group_size, then case 2 doesn't apply.

Therefore, undefined behavior will happen whenever someone tries to access the array at index 4 (that is, group_size) or beyond. There is no cast that can wallpaper over that.
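
For contrast, here is a minimal sketch of an access pattern that stays inside those rules: take the pointer from `q.val` directly and never step outside that one `quad`. The summing `f` and the helper `sum_per_quad` are assumptions made for illustration, and whether one call per group is acceptable depends on the real API.

```cpp
#include <cstddef>

constexpr std::size_t group_size = 4;

struct quad {
    int val[group_size]{};
};

// Stand-in for the external API (assumption: it sums the values it is given).
inline int f(const int* p, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}

// Calls f once per quad. qs[g].val decays to a pointer to val[0], and f walks
// only group_size elements, so the pointer never leaves that inner array.
inline int sum_per_quad(const quad* qs, std::size_t n) {
    int total = 0;
    for (std::size_t g = 0; g < n; ++g)
        total += f(qs[g].val, group_size);
    return total;
}
```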


Otherwise, what do I need to do, to make it "not UB". A union? How exactly please?

You don't. What you're trying to do is simply not valid with respect to the C++ object model. You need an array of ints, so you must create an array of ints. You cannot treat an array of something other than ints as an array of ints (well, with minor exceptions of byte-wise arrays, but that's unhelpful to you).


The simplest valid way to process the array in groups is to just... do some nested loops:

// Needs <iterator> for std::end; the span variant needs <span> (C++20).
int arr[total_number];
for(int* curr = arr; curr != std::end(arr); curr += 4)
{
  //Use `curr[0]` to `curr[3]`;
  //Or create a `std::span<int, 4> group(curr, 4)`;
}
Nicol Bolas
  • The `reinterpret_cast` itself pedantically also doesn't work because elements of an array are not pointer-interconvertible with the array. However, the array object itself is pointer-interconvertible with the `quad` object, so `*reinterpret_cast<int(*)[group_size]>(vals2)` would work. Or really just `vals2[0].val` instead of that weirdness. Not sure whether this should be considered a defect in the standard. – user17732522 Dec 18 '22 at 20:13
  • This answer confirms my sense that my code is UB(1), but offers no solution (ie 2 & 3) – Oliver Schönrock Dec 18 '22 at 20:17

No, this is not permitted. The relevant C++ standard section is §7.6.1.10. From the first paragraph, we have (emphasis mine)

The result of the expression reinterpret_cast<T>(v) is the result of converting the expression v to type T. If T is an lvalue reference type or an rvalue reference to function type, the result is an lvalue; if T is an rvalue reference to object type, the result is an xvalue; otherwise, the result is a prvalue and the lvalue-to-rvalue, array-to-pointer, and function-to-pointer standard conversions are performed on the expression v. Conversions that can be performed explicitly using reinterpret_cast are listed below. No other conversion can be performed explicitly using reinterpret_cast.

So unless your use case is listed on that particular page, it's not valid. Most of the sections are not relevant to your use case, but this is the one that comes closest.

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).

So a reinterpret_cast from one pointer type to another is equivalent to a static_cast through an appropriately cv-qualified void*. Now, the result of a static_cast that goes from T* to S* this way can be acceptably used as an S* if the T object it pointed to is pointer-interconvertible with an S object. From §6.8.4 ([basic.compound])

Two objects a and b are pointer-interconvertible if:

  • they are the same object, or
  • one is a union object and the other is a non-static data member of that object ([class.union]), or
  • one is a standard-layout class object and the other is the first non-static data member of that object or any base class subobject of that object ([class.mem]), or
  • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast ([expr.reinterpret.cast]).

[Note 4: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note]

To summarize, you can cast a pointer to a standard-layout class C to a pointer to its first non-static data member (and back). You can cast a pointer to C into another pointer to C (that can come up if you're adding cv-qualifiers; for instance, reinterpret_cast<const C*>(my_c_ptr) is valid if my_c_ptr is C*). There are also some special rules for unions, which don't apply here. However, you can't factor through arrays, as per Note 4. The conversion you want here is quad[] -> quad -> int[group_size] -> int, and the last step, from the array member to its first element, is not between pointer-interconvertible objects. If quad was a simple struct that contained only an int, then you could reinterpret a quad* as an int*, but you can't do it through arrays, and certainly not through a nested layer of them.
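
The permitted single-member case can be sketched like this. The names `single` and `read_through_cast` are made up for the illustration; the point is only that a standard-layout class and its first non-static data member are pointer-interconvertible, so this particular cast yields a usable `int*`.

```cpp
#include <type_traits>

// Hypothetical example type: a standard-layout struct whose first (and only)
// non-static data member is an int.
struct single {
    int value;
};

static_assert(std::is_standard_layout<single>::value,
              "pointer-interconvertibility with the first member needs standard layout");

// `s` and `s.value` are pointer-interconvertible, so the cast gives a pointer
// to `s.value` itself, and writing through it modifies that member.
inline int read_through_cast(single& s) {
    int* p = reinterpret_cast<int*>(&s);
    *p += 1;
    return s.value;
}
```

Replacing `int value;` with `int val[4];` changes the picture: the struct is then interconvertible with the array member, but not with the array's first element, which is the gap Note 4 describes.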

None of the sections I've cited say anything about alignment. Or size. Or packing. Or padding. None of that matters. All your static_asserts are doing is slightly increasing the probability that the undefined behavior (which is still undefined) will happen to work on more compilers. But you're using a bandaid to repair a dam; it's not going to work.

Silvio Mayolo