I have a problem with a corrupted stack in a multithreaded application.
There is a class:
class A {
public:
// some public methods
private:
// some references to other objects, like:
ClassA& ref;
ClassB& ref2;
...
// some fields, like:
std::map<std::string, enumClass> ...
std::mutex ...
std::map<std::string, someClass> ...
std::mutex ... // another mutex
std::map<std::string, std::pair<ClassB, someEnum>> corrupted_map;
bool isTrue;
};
To be more specific, the issue appeared as a segmentation fault caused by operator[] on corrupted_map. A debug session also showed that one of the fields of the STL tree had been changed without any operation on corrupted_map. That is why I think it is stack memory corruption: the right-leaf pointer of the STL red-black tree header points to inaccessible memory.
Further investigation shows that another map operation corrupts corrupted_map. Another problem is that reproducing the issue takes about 30 minutes and requires a lot of traffic (it is one of the box tests).
Analysing the core dump is pointless, because the corruption happens about 1-2 minutes before the core dump.
My question for the experts is: how can I detect the origin of this stack memory corruption? Are there other tools I should try?
I tried:
ASAN (AddressSanitizer) - nothing detected until the segfault
GDB - too slow; the application is killed before the issue reproduces (a lot of watchdogs, timing dependencies, etc.)
Valgrind - also too slow for the full application; on unit tests it detected nothing
static code analyzers - nothing detected
TSAN (ThreadSanitizer) - I fixed some of the detected issues, but that did not help
I found the place that corrupts the map by using an additional thread that scans the STL tree fields every 2 ms, plus additional checks in suspicious methods. However, the map operation that causes the issue is probably operating on corrupted data itself.
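For illustration, here is a minimal sketch of that kind of periodic scanner. It is not my real code: instead of reading the STL tree internals (which are implementation-specific), it assumes canary words placed directly before and after the map and checks them every 2 ms. The names GuardedMap and CanaryScanner, the canary value, and the simplified int value type are all hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <thread>

// Hypothetical stand-in for the affected fields: the map is bracketed by
// canary words, so a linear overwrite from a neighbouring field hits them.
struct GuardedMap {
    static constexpr std::uint64_t kCanary = 0xDEADC0DEFEEDFACEULL;

    std::atomic<std::uint64_t> front{kCanary};  // placed just before the map
    std::map<std::string, int> corrupted_map;   // value type simplified here
    std::atomic<std::uint64_t> back{kCanary};   // placed just after the map

    // True while both canaries still hold their original value.
    bool intact() const {
        return front.load() == kCanary && back.load() == kCanary;
    }
};

// Background thread that re-checks the canaries once per period.
class CanaryScanner {
public:
    CanaryScanner(const GuardedMap& target, std::chrono::milliseconds period)
        : target_(target), period_(period), worker_([this] { run(); }) {}

    ~CanaryScanner() {
        stop_.store(true);
        worker_.join();
    }

    // True once any scan has seen a damaged canary.
    bool tripped() const { return tripped_.load(); }

private:
    void run() {
        while (!stop_.load()) {
            if (!target_.intact()) {
                // When hunting the bug, calling std::abort() here would give
                // a core dump taken within one period of the overwrite.
                tripped_.store(true);
            }
            std::this_thread::sleep_for(period_);
        }
    }

    const GuardedMap& target_;
    std::chrono::milliseconds period_;
    std::atomic<bool> stop_{false};
    std::atomic<bool> tripped_{false};
    std::thread worker_;  // declared last so it starts after the flags exist
};
```

Usage would be something like constructing `CanaryScanner scanner(guarded, std::chrono::milliseconds(2));` next to the object and polling `scanner.tripped()`. Canaries of this kind only catch writes that run over the neighbouring memory; a wild write that lands directly inside the map header would not touch them.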