1

I'm writing a party system for a game, and I'm getting some very strange behavior out of a std::set std::wstring that contains IP addresses for the party. The error I'm getting is in xtree, usually in clear, but I've also hit it in insert. It's a read access violation at _Head->_Parent, usually says something like "_Head->_Parent was [memory address]."

This problem is very rare, and I've been trying to reproduce it for 2 months with no success. The game is up 24/7, and thousands of parties are created and destroyed each day, and I only get this error 2-3 times a week. Everything that inserts into, erases from, iterates over, or clears this set has a global mutex around it, that's only used for this set. Here's the relevant code:

    Party.h
    
    ```
    private: 
    std::set<std::wstring>      ipList;
    
    public: 
    std::set<std::wstring> getIPList() { return ipList; }
    ```
    
    Party.cpp
    
    ```
    PartyManager::DisbandParty(Party* party){
       SAFE_DELETE(party);
    }
    
    Party::~Party(){
       ipList.clear();
    }
    
    Party::AddPartyMember(){
       getUniquePlayerLock.lock();
       ipList.insert(s2ws(player->GetClientSession()->GetRemoteIP()));
       getUniquePlayerLock.unlock();
    }
    
    Party::LeaveParty(){
       getUniquePlayerLock.lock();
       ipList.erase(s2ws(player->GetClientSession()->GetRemoteIP()));
       getUniquePlayerLock.unlock();
    }
    
    Party::KickMember(){
       getUniquePlayerLock.lock();
       ipList.erase(s2ws(kickedplayer->GetClientSession()->GetRemoteIP()));
       getUniquePlayerLock.unlock();
    }
    
    Party::GetUniquePlayers(){
        std::set<std::wstring>::iterator it;
        std::wstring curIP;
        std::list<CPlayer*> uniquePlayers;
    
        if (!ipList.empty()) {
            for (it = ipList.begin(); it != ipList.end(); it++)
            {
                curIP = *it;
                //std::wcout << "\nChecking IP " << curIP;
    
                for (int i = 0; i < m_byMemberInfoCount; i++)
                {
                    //std::cout << "\n" << i;
                    CPlayer* player = GetPlayer(i);
                    if (player) {
                        if (curIP == s2ws(player->GetClientSession()->GetRemoteIP())) {
                            uniquePlayers.push_back(player);
                            //std::wcout << "\n" << curIP;
                            break;
                        }
                    }
                }
            }
        }
        return uniquePlayers;
    }
    ```
    

I'm all out of ideas here. It has to be some sort of undefined behavior, memory corruption, or something very obvious that I'm missing. Maybe in certain circumstances ipList is being destroyed before the party destructor is called? I'm still fuzzy on when variables declared in header files are considered "out of scope" and thus destroyed. Any help is appreciated.

Edit: With this logic, would the value of player->GetClientSession()->GetRemoteIP() be destroyed until it is repopulated?

Edit2: I've made some changes to check for nulls in the client session and IP as requested. This time, I'm getting an access violation error on accessing the lower bound of the set.

    ```
    template <class _Keyty>
        _Tree_find_result<_Nodeptr> _Find_lower_bound(const _Keyty& _Keyval) const {
            const auto _Scary = _Get_scary();
            _Tree_find_result<_Nodeptr> _Result{{_Scary->_Myhead->_Parent, _Tree_child::_Right}, _Scary->_Myhead}; **ACCESS VIOLATION ERROR IS HERE**
            _Nodeptr _Trynode = _Result._Location._Parent;
            while (!_Trynode->_Isnil) {`enter code here`
                _Result._Location._Parent = _Trynode;
                if (_DEBUG_LT_PRED(_Getcomp(), _Traits::_Kfn(_Trynode->_Myval), _Keyval)) {
                    _Result._Location._Child = _Tree_child::_Right;
                    _Trynode                 = _Trynode->_Right;
                } else {
                    _Result._Location._Child = _Tree_child::_Left;
                    _Result._Bound           = _Trynode;
                    _Trynode                 = _Trynode->_Left;
                }
            }
    
            return _Result;
        }
```

What could I possibly be doing to this set to cause this memory access error? It's a member variable being accessed in a class function, with no out of band destructors being called on it anywhere. I'm totally lost here.

Edit 3: Adding ifdefs for the global mutex below:

globalVariables.h:

#pragma once
#ifndef playerLockDefined
    #include <mutex>
    #define playerLockDefined
extern std::mutex getUniquePlayerLock;
#endif // !1

globalVariables.cpp:

#include "stdafx.h"
#include "globalVariables.h"

std::mutex getUniquePlayerLock;
TDummy
  • 27
  • 3
  • 2
    enable core dumps and look at the stack trace. – tkausl Mar 21 '21 at 03:57
  • I can get a call stack, but it doesn't shed any light on the issue. I've hit the error in xtree from both AddPartyMember() and from the Party destructor. With this logic, would the value of player->GetClientSession()->GetRemoteIP() be destroyed until it is repopulated? – TDummy Mar 21 '21 at 04:01
  • Can `player->GetClientSession()->GetRemoteIP()` ever return `nullptr`? I would consider adding lots and lots of `assert` statements checking program invariants. – Galik Mar 21 '21 at 04:15
  • @Galik I've never SEEN it return nullptr, but I'm starting to suspect that with my logic there, the party destructor is setting it to null in some cases. I've got cout's all over the place, and I am getting parties with no IP's in them disbanded (which is to be expected, I'm manually removing the IP when the player leaves the party). Maybe I can set the IP to a variable and check if that's null before I do anything with it? – TDummy Mar 21 '21 at 04:26
  • First idea: GetClientSession() and GetRemoteIP() both return pointers. Do you need to check those for null? Second: what compiler version and OS do you compile and run on? Maybe you have some standard library ABI problems... this can happen if you try to use a newer compiler than the OS package manager provides (for example, using latest clang on CentOS is bound to break) – schteppe Mar 21 '21 at 09:07
  • @schteppe I'll try checking for nulls in client session and remote IP! – TDummy Mar 21 '21 at 09:46
  • Do you have a multithreaded environment? Did you ensure the existence of this itself for calls into your class? – Secundi Mar 21 '21 at 12:47
  • I'm not sure what you mean @Secundi, are you asking if I've checked that the party itself exists? – TDummy Mar 21 '21 at 21:47
  • @TDummy yes, that's what I meant.. BTW: your set clearing within the destructor is not necessary (though it might help you to better.analyze your issue maybe here). – Secundi Mar 21 '21 at 23:11
  • @Secundi I can add an if statement to see if the party itself exists when SAFE_DELETE is called. Since the party is dynamically allocated, when the virtual destructor is called, all the destructors for the children are also called, correct? If that's the case, wouldn't I be seeing this error every time a party is disbanded, and not just the very rare cases I'm seeing now? The only reason I have the ipList clearing is because there are other maps in the destructor that are also being cleared, but I'm not sure why. It seems arbitrary, which member variables are emptied in this destructor. – TDummy Mar 21 '21 at 23:45
  • @Secundi Most of the errors I get are in the destructor, when I'm clearing the ipList set. I've been unable to noodle my way around why though. – TDummy Mar 22 '21 at 00:01
  • The destruction of class members within a destructor is done in the reverse order of their construction order, i.e. according to their declaration position within the class. Sounds like the actual issue lies somewhere else. Did you try to use a heap checker? Do you have minimal test case scenarios for your class that might be used to reproduce your issue, mabye via a massive fuzzy approach? Other common mistake: violation of the one definition rule, for instance via wrong pre-compiler ifdefs. – Secundi Mar 22 '21 at 07:04
  • I was hoping I could solve this one before I got to the heap checker point. I'm only declaring it within the header, and adding/removing via the code up there. What you see there is literally all of the code in the entire project that manipulates that set. – TDummy Mar 22 '21 at 14:12
  • @Secundi Your ifdef comment got me thinking that maybe I've done something wrong with the global mutex. I'll update the OP with the globals stuff. – TDummy Mar 23 '21 at 08:31
  • Yes, try to check this. A common particular issue here might be things like #ifdef-conditioned existence of class members (for instance _DEBUG) where the preprocessor macros are not unique all over all involved translation units/projects. – Secundi Mar 23 '21 at 09:05
  • Okay! Do you see anything weird about the globalvariables.cpp/h up there? I'm still pretty new to a lot of c++. @Secundi – TDummy Mar 23 '21 at 09:08
  • I wouldn't use a global variable here and I recommend to almost always define the macro right after the ifndef statement in order to prevent subtile recursive effects. But I do not see a direct real issue with your code that could explain your crashes. A further possibility: wrong mutex usage itself (relocking...--> UB). You might further check this in detail. – Secundi Mar 23 '21 at 09:19
  • @Secundi I've added the mutex logic I'm using as well, but I don't think there's anything too crazy there. I just lock the mutex before I modify the set, then unlock it once I'm finished modifying the set. I only make one call to getUniquePlayers in the entire codebase, and I wrap that in mutex.lock and mutex.unlock as well, so it's locked before I iterate. I will say that the error with AddPartyMembers() only started after I began using this mutex, so maybe the problem is in my usage of the mutex? Everything else seems to be solved, but it's hard to say because it occurs so rarely. – TDummy Mar 23 '21 at 10:01
  • @Secundi Think I might have found it. Party.h has the member variable ipList, a std::set. File #2 includes Party.h. File #3 includes File #2, and therefore includes Party.h. File #3 has a std::list also called ipList, used for a different purpose. Could this be my issue? If so, does the code in File #3 have to be called to create this behavior? – TDummy Mar 24 '21 at 00:59
  • @TDummy depends. is this list a global in File3? Hard to see without the full relevant project parts available. But yes, check this too. – Secundi Mar 24 '21 at 08:44
  • It's not a global in File3, but wouldn't File3 reference the one that already exists in Party.h if it includes it? – TDummy Mar 24 '21 at 13:21

0 Answers0