0

I came across a curious behaviour (and I am sure it is just curious to me and there exists a perfectly valid c++ answer) when playing around with c-strings and std::string. Typically, when I pass a string to a class' constructor, I do something like this:

class Foo {
public:
  Foo(const std::string& bar) bar_(bar) { }
private:
  const std::string& bar_;
};

int main() {
  Foo("Baz");
  return 0;
}

which thus far has worked quite well and I have (perhaps naively?) never question this approach.

Then recently I wanted to implement a simple data containing class which, when stripped to its essential structure, looked like this:

#include <iostream>
#include <string>

class DataContainer {
public:
  DataContainer(const std::string& name, const std::string& description)
  : name_(name), description_(description) {}
  auto getName() const -> std::string { return name_; }
  auto getDescription() const -> std::string { return description_; }
private:
  const std::string& name_;
  const std::string& description_;
};

int main() {
    auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
    auto name = dataContainer.getName();
    auto description = dataContainer.getDescription();

    std::cout << "name: " << name.c_str() << std::endl;
    std::cout << "description: " << description.c_str() << std::endl;
}

The output is:

name: parameterName
description:

I use *.c_str() here as this is how I use it my actual codebase (i.e. with google test and EXPECT_STREQ(s1, s2).

When I remove *.c_str() in the main function I get the following output:

name: parameterName
description: tion

So the original string of the description is truncated and the initial string is missing. I was able to fix this by changing the type within the class to:

private:
  const std::string name_;
  const std::string description_;

Now I get the expected output of

name: parameterName
description: parameterDescription

Which is fine, I can use this solution, but I would like to understand what is going on here. Also, if I change my main function slightly to

int main() {
    auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
    auto name = dataContainer.getName().c_str();
    auto description = dataContainer.getDescription().c_str();

    std::cout << "name: " << name << std::endl;
    std::cout << "description: " << description << std::endl;
}

it doesn't matter how I store the string within the DataContainer class, i.e. by const ref or value. In both cases, I get

name: parameterName
description: 

along with a warning on clang:

<source>:19:17: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl]
    auto name = dataContainer.getName().c_str();

So I guess the issue is arising from *.c_str() itself? However, then I don't quite understand why I can't store the two strings name and description by const ref. Could anyone shed some light on the issue?

tom
  • 361
  • 3
  • 11

3 Answers3

1

In the first issue, where you store const std::string& references as class members, you store dangling references to temporary objects.

When you pass string literals to your constructor, they are not themselves std::string objects, they are const char[] arrays. So, the compiler has to create temporary std::string objects to satisfy the constructor's parameters, which you then store references to. Once the constructor exits, those temporaries get destroyed, leaving your stored references bound to invalid memory.

Your fix to store copies of the std::string objects rather than references to the originals is the correct solution.


In the second issue, where you call c_str() on the return values of getName() and getDescription(), it is a similar issue. You are using dangling pointers to temporary memory.

Those methods are returning std::string objects by value, thus the compiler creates temporary copies of them at the call site. c_str() returns a pointer to the internal data of a std::string object, and you store those pointers to local variables. But then the temporaries get destroyed when they go out of scope, leaving your variables pointing to invalid memory before you have a chance to use them.

You can fix that in one of three ways:

  • by saving copies of the std::string objects to the local variables, rather than saving their internal data pointers. This is what your main() code was originally doing:
auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
auto name = dataContainer.getName(); // <-- auto is deduced as std::string, name is a copy...
auto description = dataContainer.getDescription(); // <-- auto is deduced as std::string, description is a copy...

std::cout << "name: " << name.c_str() << std::endl; // <-- using c_str() pointer is safe here
std::cout << "description: " << description.c_str() << std::endl; // <-- using c_str() pointer is safe here
  • by dropping the local variables altogether and using the c_str() pointers directly in the cout statements before the temporary std::string objects go out of scope:
auto dataContainer = DataContainer{"parameterName", "parameterDescription"};

std::cout << "name: " << dataContainer.getName().c_str() << std::endl; // <-- getName() returns a temp copy, but c_str() is safe to use here
std::cout << "description: " << dataContainer.getDescription().c_str() << std::endl; // <-- getDescription() returns a temp copy, but c_str() is safe to use here
  • by having the methods return references to the std::string class members, rather than returning copies of them:
auto getName() const -> const std::string& { return name_; }
auto getDescription() const -> const std::string& { return description_; }
auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
auto name = dataContainer.getName().c_str(); // <-- no temp is returned here
auto description = dataContainer.getDescription().c_str(); // <-- no temp is returned here

std::cout << "name: " << name << std::endl; // using c_str() pointer is safe here!
std::cout << "description: " << description << std::endl; // <-- using c_str() pointer is safe here!

In this last case, just be sure not to modify the std::string class members before you use the pointers you have saved, otherwise the pointers could become invalidated.

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
  • Thanks for the additional insights. I have tested these things indeed beforehand as well, but the behaviour is the same (hence the title curious behaviour ...). Your first solution: https://godbolt.org/z/r16xnjqa6 , second solution: https://godbolt.org/z/KaPGrWWqz , third solution: https://godbolt.org/z/TxoMfe3Pe ... they all produce the original output (i.e. the name shows but the description doesn't). Tested on GCC 11.1 and Clang 12.0.0 – tom Jul 23 '21 at 18:35
  • Yup, all three of those godbolt examples are suffering from the same issue - storing dangling references to temporary objects – Remy Lebeau Jul 23 '21 at 21:20
1

As already noted, the problems in the posted code rise from dangling references to temporary objects, either stored as class members or returned and accessed by .c_str().

The first fix is to store actual std::strings as members, not (dangling) references and then write accessor functions returning const references to those:

#include <iostream>
#include <string>

class DataContainer {
public:
  DataContainer(std::string name, std::string description)
    : name_(std::move(name)), description_(std::move(description)) {}
  auto getName() const -> std::string const& { return name_; }
  auto getDescription() const ->  std::string const& { return description_; }
private:
  const std::string name_;
  const std::string description_;
};

int main() {
    auto dataContainer = DataContainer{"parameterName", "parameterDescription"};
    
    std::cout << "name: " << dataContainer.getName().c_str() << std::endl;
    std::cout << "description: " << dataContainer.getDescription().c_str() << std::endl;
    return 0;
}

You can see here that the output is as expected (even when using intermediate local variables).


I use *.c_str() here as this is how I use it my actual codebase

Then consider adding a couple of accessors returning exactly that:

//...
auto Name() const { return name_.c_str(); }
auto Description() const { return description_.c_str(); }
//...
std::cout << "name: " << dataContainer.Name() << std::endl;
std::cout << "description: " << dataContainer.Description() << std::endl;
Bob__
  • 12,361
  • 3
  • 28
  • 42
0

The following is happening: you’re returning the std::string by copy (i.e. a temporary). Then c_str() will return a pointer to data within that temporary, which will be destroyed after the statement. Hence the warning. Return const std::string& instead to get rid of it.

Matthias Grün
  • 1,466
  • 1
  • 7
  • 12