EDIT: To summarize from the comments (before I close the topic):
- the issue has been discussed here previously:
The resolution was that the committee was aware that this breaks existing code.
- I don't know what urgent issue led the LWG to replace the old version (which was in the standard from C++98 through C++17, i.e. for about 20 years) instead of doing it the following way (as IMHO was done, for good reason, with
std::gets
in C++14):
Step 1: Only ADD the new templated version expecting a "reference to an array", from which the number of elements is deduced at compile time (which surely is a protection against buffer overruns). It would have taken effect "from day one".
Step 2: Deprecate the version that was in the standard from C++98 to C++17 and wait for community feedback on whether there are use cases valid enough to keep it. Then, maybe, remove it one standard later.
I think a valid use case is what I show here: https://godbolt.org/z/nG174vnqP
(extracted from real code but shortened to show the issue only). It also contains a little demo of why I think the old and the new version could well have co-existed. But maybe I'm wrong with that assessment.
What I currently find most annoying is that there is no way to resolve the issue in C++20 without stepping into UB-land. Especially as I think the "old" version is still available internally and the new version just forwards to it - which is highly probable, as you don't want a separate implementation for each different array length.
With the release of C++20, the operator>>
overload for reading into a char array expects a char(&)[N]
argument instead of a char*
. The following code, which compiled correctly since C++98, no longer compiles:
std::size_t sz = 10;
char *cp = new char[sz];
...
std::cin >> std::setw(sz) >> cp;
To correct this, the code can be modified as follows:
std::cin >> std::setw(sz) >> *reinterpret_cast<char(*)[std::numeric_limits<int>::max()]>(cp);
See here: https://godbolt.org/z/svPcT4eao
Additionally, there's an issue with a common implementation of variable-length strings whose behavior can silently change without any indication at compile time.
To answer the question asked repeatedly in the comments as to why I don't use std::string
:
In fact, I use std::string
a lot, but I also occasionally coach people who work in projects where you don't want to add any unnecessary overhead, and some prefer string classes that don't use three pointers when a single one is sufficient. The example is extracted from one of those.
Also, I was pointed to this Can't use std::cin with char* or char[] in C++20 answer, and yes, it is about the same topic, but the even more important information in that answer is in the LWG paper it points to: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0487r1.html
It makes clear that valid C++17 code no longer compiles, but sadly it doesn't cover the silent change, where previously valid code stops being valid yet causes no compile-time error:
At least, that is assuming the
struct vbuf {
    std::size_t sz;
    char cbuf[1];
};
over-allocation technique hasn't been turned into UB by an earlier C++ standard.
In the comment below, @n.m. remarked that this was already UB in C. He may be correct (I haven't checked all the C standards since C89, and I'm relatively sure it was not UB back then), but it is at least a common technique (e.g. in the buffers for Linux messages, see sendmsg(3p) etc.), and therefore I think it is a safe assumption - at least for the Linux family of compilers - that this is well defined and safe.
But I will no longer claim that the new version causes a "silent change", because that of course does not apply if we are in UB-land.