5

In C++03, the c_str() and data() methods of std::string behave differently.

The first returns a null-terminated character array, and how that array is managed is entirely implementation-dependent. Indeed, c_str() could return a pointer to another pre-allocated buffer that always contains a null-terminated string, but this is not mandatory. However, the complexity must be constant.

The second simply returns a pointer to the internal buffer of std::string, which may or may not be null-terminated.

Thus, in C++03, you could guess that a cast operator to const char* is not a good idea. Most of the time, the expected result is a null-terminated C-style string, but since the implementation of c_str() could vary, there could be a hidden overhead behind the cast operator. Conversely, it could cause confusion if the cast operator returned the same result as data().

However, in C++11, c_str() and data() have the same behavior: both return a pointer to the std::string object's null-terminated internal buffer. A cast operator to const char* is no longer ambiguous. Why is such an operator not present in the std::string class in C++11?
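To make the C++11 guarantee concrete, here is a minimal sketch. The helper names `same_buffer` and `c_length` are made up for illustration; only `c_str()`, `data()`, `size()` and `std::strlen` come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: true when c_str() and data() expose the same
// null-terminated buffer, as C++11 and later require.
bool same_buffer(const std::string& s) {
    return s.c_str() == s.data()          // same pointer
        && s.data()[s.size()] == '\0';    // terminator right after the last character
}

// Hypothetical helper: the explicit c_str() call that a conversion
// operator would hide when calling a C API.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Both helpers should behave the same for empty and non-empty strings when compiled as C++11 or later.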

Thanks!

AntiClimacus
  • 1
    Devil's Advocate: Why should it be? – John Dibling Jul 30 '14 at 22:31
  • 5
    Why would we want one? `c_str` or `data` is more readable than an explicit cast and an implicit cast would cause so many problems. – chris Jul 30 '14 at 22:32
  • 1
    Because it is a useful method and not just "sugar code"? Always calling `c_str()` when you need a `const char*` is tedious. @chris: what kind of problems? – AntiClimacus Jul 30 '14 at 22:33
  • Casting operators in general can be a pain in the neck. Code now won't compile, and if it does compile, unwanted behavior can happen, etc... – PaulMcKenzie Jul 30 '14 at 22:33
  • The `char*` you get in **both** C++03 and C++11 is still dependent on the lifetime of the `std::string`. An implicit cast to `char*` would mean implicit creation of potentially dangling pointers. – Drew Dormann Jul 30 '14 at 22:34
  • How could the complexity be constant if you need to allocate a new buffer and fill it with the string plus terminator? – Deduplicator Jul 30 '14 at 22:35
  • @Deduplicator: this is off-topic; I did not say that another buffer is allocated. I said this is implementation-specific. – AntiClimacus Jul 30 '14 at 22:36
  • every time I've designed a class with implicit cast operator, I've taken it out again after actually trying to use the class – M.M Jul 30 '14 at 22:36
  • @PaulMcKenzie: So, why not an explicit cast to `const char*` for lvalues? – Deduplicator Jul 30 '14 at 22:37
  • 1
    @PaulMcKenzie: In the Qt library, for example, QByteArray has a cast operator to `const char*`, and I do not see what the problem is with that. – AntiClimacus Jul 30 '14 at 22:39
  • 3
    @AntiClimacus - I am in `Matt McNabb`s neighborhood of thinking. Every class I've seen developed with casting operators winds up doing something unexpected in non-trivial code. http://stackoverflow.com/questions/492061/why-doesnt-stdstring-provide-implicit-conversion-to-char – PaulMcKenzie Jul 30 '14 at 22:48
  • 1
    If you're asking why it doesn't have an explicit cast operator, it wouldn't have any advantage over `c_str()`. If you're asking why it doesn't have an implicit cast operator, see the link given by @PaulMcKenzie. – Mark Ransom Jul 30 '14 at 23:13

2 Answers

3

Your question is fundamentally about the philosophy behind the design of the string class. I can but opine.

Why should string have a cast operator to const char*? Cast operators are syntactic sugar for other operations, and are truly needed only in unusual circumstances. Really, they are never needed -- you can always accomplish the same goal in another way.

string already provides the means to interact with old C-style interfaces, via c_str and data. Adding a cast operator into the mix doesn't add functionality and does add complexity to the class. Moreover, using a cast operator is always semantically murky. In call-site code, a cast such as static_cast<const char*> is generally expected to be a compile-time operation. By performing this cast through run-time code, you make your code ambiguous and less clear. Because the expectations and reality aren't the same, it's much easier to misuse this run-time cast than the compile-time equivalent.

I would argue that there should not be an implicit conversion operator anywhere it's not truly needed; and it isn't here.
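As a rough illustration of the point, consider a hypothetical wrapper, here called `ImplicitText`, that adds the implicit conversion std::string omits. The class and the helpers `make_text`, `safe_use` and `temp_use` are invented for this sketch and exist in no library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical wrapper, NOT part of the standard library.
class ImplicitText {
public:
    explicit ImplicitText(std::string s) : text_(std::move(s)) {}
    // Looks like a cast at the call site, but runs code at run time.
    operator const char*() const { return text_.c_str(); }
private:
    std::string text_;
};

ImplicitText make_text() { return ImplicitText("temporary"); }

// Fine: t outlives the call that uses the converted pointer.
std::size_t safe_use() {
    ImplicitText t("hello");
    return std::strlen(t);  // silent conversion to const char*
}

// Also fine: the temporary lives until the end of the full expression.
std::size_t temp_use() {
    return std::strlen(make_text());
}

// The hazard: the line below would compile without a diagnostic, yet the
// pointer would dangle once the temporary is destroyed at the semicolon.
//     const char* p = make_text();
```

With `c_str()` the same mistake is possible, but the call is visible at the call site, which is the readability argument made above.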

John Dibling
  • 1
    I'd also add Drew's point: dangling pointers are encouraged with an implicit conversion. Some STL areas have been designed in order to avoid common pitfalls. – Marco A. Jul 30 '14 at 22:38
  • Where does the OP say the cast operator shall be implicit? Also, there's an easy way to restrict it to non-temporaries. – Deduplicator Jul 30 '14 at 22:41
  • @Deduplicator: Well, I guess he doesn't. But at the same time he hasn't said it should be explicit. I still don't see the necessity either way. – John Dibling Jul 30 '14 at 22:42
  • @Deduplicator The problem exists with non-temporaries. With two lvalues, you still have one dependent on the lifetime of the other. – Drew Dormann Jul 30 '14 at 22:44
  • 1
    @JohnDibling: Thanks. I understand the STL philosophy, but C++ philosophy is also `allowing a useful feature is more important than preventing every possible misuse of C++`, and all your responses are about misuses, dangling pointers, thread safety, etc. These errors could be made with the pointer returned by `c_str()`, too. Furthermore, if it is not a good idea to use a cast operator because casts are expected to happen at compile time and not at run time, when could I use a cast operator? What would be a good case for using one? – AntiClimacus Jul 30 '14 at 23:05
  • @AntiClimacus While you can write incorrect code using `c_str()`, **the problem with an implicit cast operator is that sometimes you won't notice that you got a temporary object if you are not careful enough.** The other problem is that with a class as widely used as `std::string`, such a change would also break or change the behavior of existing code in some cases. – Phil1970 Aug 02 '17 at 00:54
  • @AntiClimacus The philosophy is also changing over the years... Newer features are designed to help write correct code, but obviously it is almost impossible to fix some major flaws in the language like implicit fallthrough, 0 used as nullptr, declaration/initialization ambiguity... – Phil1970 Aug 02 '17 at 01:04
  • I want to add a note here, maybe just as much a question: std::string does take const char* as a constructor overload. That makes it possible (or even common) to pass a regular C string to a function/method that takes a std::string (which will create a temporary object in the calling scope), and that may well violate the runtime of the string in the same way the const char* operator would have. Doesn't this indicate some kind of hypocrisy in the reasoning here? – Larswad Sep 25 '19 at 12:09
  • @Larswad: The overload taking `const char*` makes a *new* `std::string` containing a *copy* of the `const char*`'s data. Once constructed, there's no further tie to the `const char*`, so there's no risk of "violating the runtime of the string". The converse, getting the `const char*` from the internals of a `std::string` *does* have that problem; the lifetime of the `const char*` is tied to the `std::string`, but APIs accepting `const char*` have no way to know that; having them silently convert it to `const char*` risks them *storing* it beyond the lifetime of the `std::string`. – ShadowRanger Nov 19 '21 at 20:14
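The asymmetry described in the last comment can be sketched as follows. The function names `string_owns_copy` and `pointer_tracks_string` are hypothetical and chosen only for this example, which assumes C++11 or later.

```cpp
#include <cassert>
#include <string>

// const char* -> std::string: the constructor copies the characters,
// so the new string has no further tie to the source array.
bool string_owns_copy() {
    char source[] = "hello";
    std::string copy(source);
    source[0] = 'j';           // mutate the original C array
    return copy == "hello";    // the std::string is unaffected
}

// std::string -> const char*: the pointer refers to the string's own
// storage, so it sees later edits and is only valid while the string lives.
bool pointer_tracks_string() {
    std::string s = "hello";
    const char* p = s.c_str();
    s[0] = 'j';                // operator[] does not invalidate p
    return p[0] == 'j';        // p observes the change
}
```

The second function only reads the pointer while `s` is still alive; storing `p` beyond that point is the risk the comment describes.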
1

The main reason for these changes was thread safety, and in particular avoiding the invalidation of iterators and references. Achieving this required null-terminated buffers.

More can be read in proposal N2534.

101010