I got this problem from a friend:

#include <string>
#include <vector>
#include <iostream>

void riddle(std::string input)
{
    auto strings = std::vector<std::string>{};
    strings.push_back(input);
    auto raw = strings[0].c_str();

    strings.emplace_back("dummy");

    std::cout << raw << "\n";
}

int main()
{
    riddle("Hello world of!"); // Why does this print garbage?
    //riddle("Always look at the bright side of life!"); // And why doesn't this?

    std::cin.get();
}

My first observation is that riddle() does not produce garbage when the input contains more than three words. I am still trying to see why it fails in the first case and not in the second. Anyway, I thought this would be fun to share.

Wolfy
  • The `emplace_back` call causes a reallocation of the vector. – David G Feb 21 '19 at 03:50
  • It probably works for the second case and not the first because the first string uses SSO and the second does not. You shouldn't rely on this behavior though. – tkausl Feb 21 '19 at 03:53
  • @0x499602D2 I see so why does the emplace_back() fail for the first case but not the second? – Wolfy Feb 21 '19 at 04:25

2 Answers

This is undefined behavior (UB), meaning that anything can happen, including the code appearing to work. It is UB because emplace_back may reallocate the vector's storage (which apparently it does here), and a reallocation invalidates all pointers, references, and iterators to the elements of the vector.

The first case of UB "doesn't work" because of the short string optimization (SSO). With SSO the characters are stored inside the string object itself, so the raw pointer points into the memory allocated by the vector, which is freed during reallocation.

The second case of UB "works" because the text is too long for SSO and resides in a separately allocated memory block. During reallocation the old string object is moved from, which transfers ownership of that block to the newly created string object. Since the block simply changes owner, it happens to remain valid after emplace_back.

Michael Veksler

std::string::c_str() :

The pointer returned may be invalidated by further calls to other member functions that modify the object.


std::vector::emplace_back :

If a reallocation happens, all contained elements are modified.


Unless you have reserved enough capacity beforehand, you cannot rely on emplace_back not reallocating the vector, so you have to assume that any later use of the pointer returned earlier by string::c_str() leads to undefined behavior.

Since undefined behavior is, well, undefined, anything can happen. Hence, your code may seem to work or it may seem to fail. It is in error either way.

Sid S