Using calloc'ed memory as a typed array without UB

Question

I need to construct a large dynamically-sized array of type T, which has an all-zero bit pattern as the representation of the default-constructed value. Actually default-constructing every element like std::vector does (effectively memsetting the memory to zero) incurs a measurable overhead in my application.

I would like to use calloc to make use of OS-zeroed pages instead. I cannot simply cast the returned memory to T*, since no T was ever constructed in that memory, violating the strict-aliasing rule and causing undefined behavior (see this answer).

Using placement-new in calloced memory is correctly optimized away by Clang 9 and gcc-trunk, but not by released versions of GCC or any version of MSVC (see Godbolt).
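For reference, the placement-new approach described above might look roughly like the sketch below. The struct `T` and the helpers `make_zeroed_array` and `release_array` are hypothetical names chosen for illustration, not the code from the Godbolt link. The loop value-initializes every element, which is the redundant zeroing that some compilers elide and others do not.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type (an assumption for this sketch): its
// value-initialized state is all-zero on mainstream platforms.
struct T {
    int a;
    double b;
};

// Allocate n zeroed elements with calloc, then begin the lifetime of each
// element with placement-new. Returns nullptr if the allocation fails.
T* make_zeroed_array(std::size_t n) {
    void* raw = std::calloc(n, sizeof(T));
    if (raw == nullptr) {
        return nullptr;
    }
    unsigned char* bytes = static_cast<unsigned char*>(raw);
    for (std::size_t i = 0; i < n; ++i) {
        // Value-initialization writes zeros again; whether the optimizer
        // removes this redundant store depends on the compiler and version.
        ::new (static_cast<void*>(bytes + i * sizeof(T))) T();
    }
    return static_cast<T*>(raw);
}

// T is trivially destructible, so the storage can be handed straight
// back to free without running destructors.
void release_array(T* p) {
    std::free(p);
}
```

A caller would use `make_zeroed_array(n)` in place of `std::vector<T>(n)` and pair it with `release_array`. Whether the loop compiles down to nothing has to be checked per compiler in the generated assembly.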

Is there any way to reliably get rid of the memset here?

Not sure if it really fits the bill, but [`std::launder`](https://en.cppreference.com/w/cpp/utility/launder) might be of help? — Max Langhof, Oct 16 '19 at 07:32
The way I understand it, std::launder only helps with object lifetimes, e.g. when paving over an object containing const attributes with placement new. You still need to pass in a `T*` which I cannot construct without introducing UB. — Fabian Knorr, Oct 16 '19 at 08:11
@Fabian Knorr Is the measurable overhead really that big? If you really want to do this, maybe isolate the logic to one small cpp file, compile that with strict aliasing optimizations disabled, and look at the generated code with Godbolt. You would be in good company, AFAIK, the Linux kernel still disables strict aliasing optimizations, albeit for plain C. — Erik Alapää, Oct 16 '19 at 08:12
@ErikAlapää: The aliasing rules were written with the presumption that the ability of implementations intended for various purposes to recognize constructs that are useful for those purposes would be recognized as a Quality of Implementation issue. The myth that these rules were intended to force programmers to generate slower code is absurd, but seems to be unshakable. — supercat, Oct 26 '19 at 23:59
@supercat: Of course the rules were not intended to force programmers to generate slow code. But sometimes, I want to use C or C++ as a portable assembler, and not have the compiler second-guess me. Look at Linus Torvalds' strict aliasing debate with the gcc developers ca. 10-15 years ago for more context. — Erik Alapää, Oct 27 '19 at 13:08
@ErikAlapää: What is unfortunate about those debates is that Mr. Torvalds blamed the authors of the Standard for allowing nonsensical behavior, rather than recognizing that the Standard regards the ability to process *any* useful programs as a quality-of-implementation issue. Instead of arguing about whether the Standard allows implementations to behave nonsensically, LT should have conceded that it makes no effort to forbid garbage-quality compilers from behaving in garbage fashion. — supercat, Oct 27 '19 at 17:01
@supercat I have used gcc for more than 2 decades, about 2/3 of the time I have programmed professionally in C and C++. I do not consider gcc a garbage-quality compiler. But I would like more provisions in the standard to give back control to the programmer. Good examples are disabling aliasing optimizations, and the 'restrict' keyword in C. Constructs similar to restrict are available for C++ in e.g. gcc. Like I said, sometimes I want to use C/C++ as a portable assembler, and the compiler to translate the C straight to assembler without extensive optimization/pessimization/bugs. — Erik Alapää, Oct 27 '19 at 18:37
@ErikAlapää: The maintainers of gcc take the attitude that if the Standard would allow them to process a piece of code in nonsensical fashion, then the code is "broken", even though the authors of the Standard explicitly stated that they did not wish to demean programs which, while useful, didn't happen to be portable. Further, both gcc and clang have adopted an optimization approach which is dangerous and unsound. Among other things, it leads to situations where a `condition` with no side effects, along with statements `s1` and `s2` will all have defined behavior, but... — supercat, Oct 28 '19 at 06:17
`if (condition) s1 else s2;` will jump the rails. For example, given `extern int x[],y[]; int test(int *p) { int ret=y[0]; if (p == x+1) *p = 2; else {/*empty else*/} return ret+y[0]; }`, even with `-O1 -fno-strict-aliasing`, gcc will conclude that it would be impossible for `*p` to identify `y`, and thus not reload `y` after the write to `*p`. If the compiler were to place `x` and `y` non-consecutively, such a conclusion would be sound, but it is fundamentally unsound if the compiler has no idea how `x` and `y` are placed, since the behavior would be defined if `x` and `y` are consecutive. — supercat, Oct 28 '19 at 06:29
@supercat: For me, it all boils down to being able to turn off 'compiler interference'. Using -O0 and -fno-strict-aliasing is a very blunt tool for that — Erik Alapää, Oct 28 '19 at 08:14
Add a private constructor to T that does nothing, add a friend that constructs an array of T in calloc'd memory using said constructor. — n. m. could be an AI, Oct 28 '19 at 08:20
@ErikAlapää: It's a very blunt tool, but the authors of gcc seem to have no interest in anything better except that, interestingly enough, the `register` keyword helps a lot when using `-O0` and can sometimes result in the compiler generating optimal code even at that setting. — supercat, Oct 28 '19 at 14:50

