
I have some C++ code that, when run from Xcode with the Undefined Behaviour Sanitizer turned on, reports: "runtime error: store to misaligned address 0x7f8bcc403771 for type 'int', which requires 4 byte alignment".

So I created a small Catch2 test case that reproduces the code, to check the runtime behaviour on both Windows/x64 (MSVC) and macOS (Xcode 11/Clang), but everything runs as expected, even when compiled with different optimisation levels (-O2, -O3, -Ofast, etc.).

The code in question is (Catch2 test case):

TEST_CASE("misaligned_write", "[demo]") {
    unsigned char *data = (unsigned char*)malloc(20);
    memset(data, 0, 20);

    int *ptr = reinterpret_cast<int*>(&data[1]);
    *ptr = 0x11223344; // undefined behaviour triggered
    CHECK(static_cast<uint8_t>(data[1]) == 0x44);
    CHECK(static_cast<uint8_t>(data[2]) == 0x33);
    CHECK(static_cast<uint8_t>(data[3]) == 0x22);
    CHECK(static_cast<uint8_t>(data[4]) == 0x11);
}

So my question is: is this an undefined behaviour false positive, or is there something in the code that could break in the future, for example due to changes in default compiler flags?

vcarreira
  • Well, `0x7f8bcc403771 % 4 != 0`, so that's why it's complaining. – Rietty Oct 16 '19 at 14:34
  • Showing a code sample that doesn't behave poorly does not show that there isn't undefined behavior. You cannot disprove undefined behavior by counterexample, no matter how thorough the counterexample is. One of the permissible behaviors of UB is that it can act exactly how you would expect the code to act, whatever that expectation might be (because *any* behavior is permissible). So whatever you expect well-behaved code to do, UB may have the same behavior every time you check and still be UB. – François Andrieux Oct 16 '19 at 14:34
  • `int *ptr = reinterpret_cast<int*>(&data[1]); *ptr = 0x11223344;` is UB. There is no `int` at `ptr`, so you can't write one to it. This may change in the future (there is a proposal asking for it), but until that gets standardized it's UB. – NathanOliver Oct 16 '19 at 14:35
  • In principle this should work, but you have no guarantee from the standard on the behavior of the program. – NathanOliver Oct 16 '19 at 14:39
  • @NathanOliver do you happen to have a link to that proposal by any chance? – rustyx Oct 16 '19 at 14:46
  • Note that `reinterpret_cast` is only defined if you cast some type `A` to `B` (but don't use it as `B`) and then back to `A`. Other uses are undefined behavior. For instance, it is useful when you need to pass data to a callback function that requires you to cast your data to `void *`. Then, inside the callback, you cast it back to your original type. – darcamo Oct 16 '19 at 14:54
  • @rustyx Found it: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0593r3.html – NathanOliver Oct 16 '19 at 14:59
  • Knowledge of undefined behavior is simple: either you know it or you don't. You cannot disprove it with a code example that behaves "correctly". For example, I could probably take forever trying to get `char *p = new char [10]; delete p;` to fail, but I know that `delete p;` is undefined behavior. Why do I know this? Because the standard document tells us it is. – PaulMcKenzie Oct 16 '19 at 15:00
  • I believe you can avoid the UB by doing: `int int_data = 0x11223344; memcpy(&data[1], &int_data, sizeof int_data);`. Note there is platform specific behavior regarding endianness, and int size; I presume those things don't matter for your use case. – Eljay Oct 16 '19 at 16:54
  • What happens if you use struct packing (`__attribute__((packed))`) instead of `reinterpret_cast`, does that also trigger UBsan? – rustyx Oct 16 '19 at 19:29
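
The `memcpy` suggestion from the comments above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the names `store_u32_unaligned` and `load_u32_unaligned` are hypothetical and do not appear in the original code, and a fixed-width `std::uint32_t` is used in place of `int` so that the size is explicit. The byte order in memory still depends on the platform's endianness.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): writes a 32-bit value
// to an arbitrary byte address. std::memcpy places no alignment requirement
// on either pointer, so no misaligned 'int' store is performed.
inline void store_u32_unaligned(unsigned char* dst, std::uint32_t value) {
    std::memcpy(dst, &value, sizeof value);
}

// Hypothetical counterpart: reads a 32-bit value back from any byte address.
inline std::uint32_t load_u32_unaligned(const unsigned char* src) {
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

In the test case from the question, `store_u32_unaligned(&data[1], 0x11223344);` would take the place of the `reinterpret_cast` and the store through `ptr`. On a little-endian machine the four `CHECK` lines should then still see the same bytes, and mainstream compilers typically lower a fixed-size `memcpy` like this to a single move instruction where the target allows it.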

1 Answer


malloc() always returns a pointer that is suitably aligned for any type with a fundamental alignment requirement. &data[1] is a pointer to the second byte of that block, so it is not correctly aligned for any type whose alignment requirement is greater than 1. Use &data[0] (or simply data) if you want a pointer to the first four bytes of the malloc'ed memory.

TeaRex