0

I have been having this rare segfault, and while debugging it, I managed to get the following output from addr2line program.

void std::string::_S_copy_chars<__gnu_cxx::__normal_iterator<unsigned char 
const*, std::vector<unsigned char, std::allocator<unsigned char> > > >
(char*, __gnu_cxx::__normal_iterator<unsigned char const*, 
std::vector<unsigned char, std::allocator<unsigned char> > >,
__gnu_cxx::__normal_iterator<unsigned char const*, std::vector<unsigned char, 
std::allocator<unsigned char> > >)
??:?

Since _S_copy_chars() is a private function in std::string, I am obviously not calling it directly. But I am unable to guess which public function is calling it. If I can figure out the public function, I can zero in on the null de-referencing that is causing the segfault.

I am suspecting the following code...

std::string CInProtocolBase::RetrieveStr(std::vector<unsigned 
char>::const_iterator& iter)
{
    unsigned long sizeOfStr;
    const unsigned char& size = *iter;
    memcpy(&sizeOfStr,&size,4);
    sizeOfStr = 
    boost::asio::detail::socket_ops::network_to_host_long(sizeOfStr);
    std::string str(iter+4,iter+4+sizeOfStr); // <= Could this be culprit??
    iter += (4 + sizeOfStr);
    return str;
}

The Other candidate is this:

std::string CInProtocolBase::VectorToStr(const std::vector<unsigned char>& vec)
{
    return std::string(vec.begin(),vec.end());
}
Sharath
  • 1,627
  • 2
  • 18
  • 34
  • Have you tried to catch the crash in a debugger, and look through the function call stack to find out where in your code it happens? – Some programmer dude Dec 12 '17 at 09:18
  • This is happening in production. I can't use a debugger there. The segfault message is the only hint I have. I am trying to figure what part of the code is causing this. The addr2line has pointed me to std::string::_S_copy_chars() function. I need to track down the code which is calling it. – Sharath Dec 12 '17 at 09:23
  • And there's no core-dump you could use with a debug build? Because there's really no way of going up the call stack if you don't actually have a call stack. – Some programmer dude Dec 12 '17 at 09:25
  • Nope, I just have a segfault message. BTW, I updated the query with some extra info. I am suspecting one function that might do this. It is a 4 year old code though. – Sharath Dec 12 '17 at 09:31

1 Answers1

3

With memcpy(&sizeOfStr,&size,4) I see two problems.

The first is that you copy four bytes from a one-byte variable. That is a clear undefined behavior.

The second is that sizeOfStr might be 8 bytes (on 64-bit systems GCC usually have long being 64 bits). This will let part of the variable be uninitialized and therefore be indeterminate, again leading to undefined behavior.

Use normal assignment and let the compiler properly do the conversion for you:

sizeOfStr = size;
Some programmer dude
  • 400,186
  • 35
  • 402
  • 621
  • Can this cause the current segfault situation? How is it running for so many months on this production server before now? Shouldn't it affect from day one? – Sharath Dec 12 '17 at 09:52
  • 2
    @Sharath It can indeed lead to crashes like the one you have. And the problem with undefined behavior is that it might *seem* to work for years and then one day something seemingly unrelated changed and it no longer works. – Some programmer dude Dec 12 '17 at 09:57
  • Thanks, let me make this change and then see if fixes the crash. – Sharath Dec 12 '17 at 10:01
  • "sizeOfStr = size" doesn't work. memcpy is still required. But I changed unsigned long to unsigned int. – Sharath Dec 12 '17 at 11:43
  • @Sharath What do you mean "doesn't work"? Even if you change `sizeOfStr` to `unsigned int` you still have *undefined behavior* since you copy four bytes from a one-byte source. – Some programmer dude Dec 12 '17 at 11:49
  • Please see my code, size is not a one-byte source, it is an unsigned char& to a vector – Sharath Dec 12 '17 at 12:35
  • @Sharath Ah, I see what you mean. But it's kind of wrong and totally misleading. You should have made `size` a *pointer* since that's how you use it. Also, now you don't handle [*endianness*](https://en.wikipedia.org/wiki/Endianness), which could be an issue. – Some programmer dude Dec 12 '17 at 12:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/161020/discussion-between-sharath-and-some-programmer-dude). – Sharath Dec 12 '17 at 13:00