2

Hi I have a cross platform C++ app running on Fedora25 which is predictably crashing after a around a day of execution with an error realloc(): invalid next size.

enter image description here

I have narrowed the issue down to on particular pthread which sends out periodic updates to connected clients and empties the outgoing message queue. I am allocating space for char * inside functions called by the thread, which I free up after sending. I do not typically use C++ so I am doing std::string stuff in the backaround an then converting to char * when I need it. I want to make sure I am not missing something simple and any tips on how to re-structure or fix this.

static void* MyPThreadFunc(void * params) {
  assert(params);
    MyAppServer *pAppServer = (MyAppServer *)params;    
    if(pAppServer != NULL) {    
       int loopCounter = 1;
       char* tempBuf;
       int tempBufLen;
       int tempDatSetDelay;
       
       while(true) {
          for(int i=0; i<pAppServer->GetUpdateDataSetCount();i++) {
             tempDatSetDelay = pAppServer->GetDataSetDelay(pAppServer->VecDatSets[i].name);
             if(tempDataSetDelay == 1 ||(tempDataSetDelay > 0 && loopCounter % tempDataSetDelay == 0)) {
                pAppServer->UpdateDataSetMsgStr(pAppServer->VecDataSets[i]);
                tempBuf = (char*)pAppServer->GetDataSetMsgStr(i); //returns const char*
                broadcast(pAppServer->Con,mg_mk_str(tempBuf));
                delete [] tempBuf;
             }//if           
          } //for
          
          //empty outgoing queue
          tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);
          while(tempMsgLen>0) {
             broadcast(pAppServer->Con,mg_mk_str(tempBuf));
             pAppServer->OUtgoingMsgQueue.dequeue();             
             delete [] tempBuf;          
             
             tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);                        
          }
                  
          sleep(1);
          loopCounter = loopCounter==std::numeric_limits<int>::max() ? 1 : ++loopCounter;   
       } //while       
       pAppServer=0;
    }
}

const char* AppServer::GetDataSetMsgStr(const int idx) {
        pthread_mutex_lock(&mLock);
        // Dynamically allocate memory for the returned string
        char* ptr = new char[VecDataSets[idx].curUpdateMsg.size() + 1]; // +1 for terminating NUL

        // Copy source string in dynamically allocated string buffer
        strcpy(ptr, VecDataSets[idx].curUpdateMsg.c_str());

        pthread_mutex_unlock(&mLock);
        // Return the pointer to the dynamically allocated buffer
        return ptr;
}
    
char* MsgQueue::peek(int &len) {
   char* myBuffer  = new char[512];
   len = 0;
   
   pthread_mutex_lock(&mLock);
   if(front==NULL) {
      len = -1;
          pthread_mutex_unlock(&mLock);
      return myBuffer;
     }
     
     len = front->len;
   strncpy(myBuffer,front->chars,len);
   pthread_mutex_unlock(&mLock);
   
   return myBuffer;
}
user3583535
  • 240
  • 2
  • 16
  • I think you should not use any `new[]` or `delete[]` calls and instead use `std::string` throughout the code. Only use `const char *` at the moment a function requires one, and that can be easily achieved by using the `c_str()` member function. You also have this: `tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);`, and there is no indication if that function used `new[]` to allocate the memory (you later call `delete [] tempBuf`). – PaulMcKenzie Jul 06 '20 at 18:44
  • Hi @PaulMcKenzie thanks for your input. I actually do that the peek() function at the bottom of my code snippet. – user3583535 Jul 06 '20 at 18:49
  • In MsgQueue::peek(), what happens if `front->len > 511`? – Stephen M. Webb Jul 06 '20 at 19:01
  • @user3583535 In `peek`, you prematurely allocate memory not knowing if it's even necessary to do so. Then you assume it is only 512 bytes to allocate, and that may not be the case (as pointed out in the previous comments), and last, the allocation is not protected by a mutex, thus is not thread-safe. – PaulMcKenzie Jul 06 '20 at 19:22
  • Is this good for peek(()? char* MsgQueue::peek(int &len) { pthread_mutex_lock(&mLock); len = 0; if(front==NULL) { len = -1; char* emptyBuffer = new char[1]; pthread_mutex_unlock(&mLock); return emptyBuffer; } len = front->len; char* myBuffer = new char[len+1]; strncpy(myBuffer,front->chars,len); pthread_mutex_unlock(&mLock); return myBuffer; } – user3583535 Jul 06 '20 at 20:33
  • You can just return a `nullptr` and not have to allocate any memory if there is no memory to allocate. Calling `delete[]` on a null pointer is perfectly valid. Also, you should post code in the question, not in the comment section – PaulMcKenzie Jul 07 '20 at 00:06

2 Answers2

2

I do not know if this solves your problem, but you have several issues with your peek function:

Issue 1: Prematurely allocating memory.


The first two lines of the function do this, unconditionally:

   char* myBuffer  = new char[512];
   len = 0;

But later on, your function could detect that len == -1, thus not needing to allocate the memory at all. There should be no allocation occurring until you are sure there is a reason to allocate the memory.


Issue 2: Allocating memory outside the mutex.

Related to Issue 1 above, you are allocating memory before the mutex has been established. If two or more threads attempt to call peek, there is a potential for both threads to allocate memory, thus cause a race condition.

All allocations should be done under the mutex, not before it has been set.

pthread_mutex_lock(&mLock);
// now memory can be allocated

Issue 3: Returning allocated memory on error.

Since the peek function returns a pointer to the allocated memory, and later on, the caller calls delete [] on this pointer, it is perfectly valid to return a nullptr and not have to allocate any memory.

In your current peek function, you return allocated memory, even on error. Here is a proposed rewrite of this code:

if(front==NULL) 
{
   len = -1;
   pthread_mutex_unlock(&mLock);
   return nullptr;
}
char* myBuffer = new char[len];
//...

Note that no allocation need to take place until it is certain there is a reason to allocate memory.


Issue 4: Assuming there are 512 bytes to allocate.

What if len is greater than 512? You assumed that the memory to allocate is of length 512, but what if len is greater than 512? Instead you should be using len to determine how many bytes to allocate, not a hard-coded 512.


Issue 5: Not using RAII to control the mutex.

What if in the middle of peek, an exception is thrown (which can happen if new[] is used)? You would have left a mutex locked, and now you have a stalled thread if that thread is waiting for the mutex to be unlocked.

Use RAII to control the locks, where basically a small struct is used where the destructor automatically calls the mutex's unlock function. This way, regardless of the reason why peek returns, the mutex will automatically become unlocked.

In C++ 11 you have std::lock_guard that accomplishes this with the C++ 11 threading model. If for some reason you have to stick to using pthreads, you can create your own RAII wrapper:

struct pthread_lock_guard 
{
   pthread_mutex_lock* plock; 
   pthread_lock_guard(pthread_mutex_lock* p) : plock(p) {}
   ~pthread_lock_guard() { pthread_mutex_unlock(plock); }
};

Then you would use it like this:

char* MsgQueue::peek(int &len) 
{
   pthread_mutex_lock(&mLock);
   pthread_lock_guard lg(&mlock);
   char* myBuffer  = new char[512];
   len = 0;
   
   pthread_mutex_lock(&mLock);
   if(front==NULL) {
      len = -1;
      return myBuffer;
     }
     
     len = front->len;
   strncpy(myBuffer,front->chars,len);
   return myBuffer;
}

Ignoring the other issues pointed out, you see that there are no more calls to unlock the mutex. That is all taken care of when the local lg object is destroyed.


So given all of these issues, here is the final rewrite of the peek function:

struct pthread_lock_guard 
{
   pthread_mutex_lock* plock; 
   pthread_lock_guard(pthread_mutex_lock* p) : plock(p) {}
   ~pthread_lock_guard() { pthread_mutex_unlock(plock); }
};

char* MsgQueue::peek(int &len) 
{
   pthread_mutex_lock(&mLock);
   pthread_lock_guard lg(&mLock);
   if(front==NULL) 
   {
      len = -1;
      return nullptr;
   }
   len = front->len;
   char* myBuffer  = new char[len];
   strncpy(myBuffer,front->chars,len);
   return myBuffer;
}

Note that this has not been compiled, so forgive any compiler errors. Also note that this may not even solve the issues your seeing now. The above is to illustrate all of the potential flaws in the current code you are showing us in the original question that could result in the errors you're seeing.

On a final note, you really should strive to use container classes as much as possible, such as std::vector<char> or std::string, so that you are not using so much raw dynamic-memory handling.

PaulMcKenzie
  • 34,698
  • 4
  • 24
  • 45
  • Unfortunately I'm using C++98 I believe it's called so can't do the RAII . Thanks for your input @PaulMcKenzie !! – user3583535 Jul 07 '20 at 12:42
  • Also, I stated this in the answer -- *If for some reason you have to stick to using pthreads, you can create your own RAII wrapper:* -- that "some reason" includes if you're using C++ 98. – PaulMcKenzie Jul 07 '20 at 12:52
0

I suspect it is because of the memory leak.

Consider the below while loop.

  //empty outgoing queue
  tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);
  while(tempMsgLen>0) {
     broadcast(pAppServer->Con,mg_mk_str(tempBuf));
     pAppServer->OUtgoingMsgQueue.dequeue();             
     delete [] tempBuf;          
     
     tempBuf = pAppServer->OUtgoingMsgQueue.peek(tempMsgLen);                        
  }

There is always a leak of 512 bytes as you are not freeing memory when tempMsgLen<=0. i.e. whenever the loop breaks OR even when your thread does not enter the loop, there is a leak.

As a quick fix, you can just add delete [] tempBuf; after this while loop and give a try.

Or change while loop to do while loop. Make sure that number of calls to peek() is equal to number of delete.

MayurK
  • 1,925
  • 14
  • 27
  • Thanks @MayurK I felt like I should be doing a Do-While loop since I would always need to be running through it at least once, but didn't know if that looped existed in C++. Anyways I still would have had the small leak which maybe causing the heap corruption I am hoping. – user3583535 Jul 06 '20 at 19:08