
I've spent a while reducing my code to a minimal reproducible example, and I think I have it. See the single main.cpp file below, compiled (on Linux) in one of two ways:

  1. In serial: g++ -O3 --std=c++17 -o test_rho.exe main.cpp
  2. With OpenMP: g++ -O3 --std=c++17 -fopenmp -o test_rho.exe main.cpp

Without OpenMP the code runs fine. With OpenMP, I get the following error:

About to return 0
free(): double free detected in tcache 2

When using OpenMP, the following changes to main.cpp will make this error go away:

  • Don't define the Wfn() constructor. More specifically, don't invoke _chi.resize(120).
  • Don't use private(_chi) in the OpenMP loop.
  • Don't use templating (instead, just use double everywhere).

What is causing a double free here, and why does each of the changes above fix the problem?

main.cpp:

#include <iostream>
#include <vector>

// Wfn class
template<typename real>
class Wfn {
    public:
        Wfn() { _chi.resize(120); }
        std::vector<real> rho(int nrhos);

    private:
        std::vector<real> _chi;
};

// Single Wfn function.
template<typename real>
std::vector<real> Wfn<real>::rho(int nrhos) {
    std::vector<real> rhos;
    rhos.resize(nrhos);
    #pragma omp parallel for private (_chi)
    for (size_t irho = 0; irho < nrhos; irho++) {
        rhos[irho] = (real)irho;
    }
    return rhos;
}

// Main
int main(int argc, char** argv) {
    Wfn<double> wfn;
    std::vector<double> rhovs;

    rhovs = wfn.rho(1000);

    std::cerr << "About to return 0" << std::endl;
    return 0;
}
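For reference, the second change listed above (not using private(_chi)) can be sketched as follows. This is only a sketch under an assumption: that the purpose of private(_chi) was to give each thread its own scratch vector. The class name WfnLocalScratch and the local variable chi are hypothetical and do not appear in the original code. It has not been checked against the GCC versions that show the double free; it simply follows the observation that dropping the private clause makes the error go away.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant of Wfn, renamed so it is not mistaken for the original.
template<typename real>
class WfnLocalScratch {
    public:
        WfnLocalScratch() { _chi.resize(120); }
        std::vector<real> rho(int nrhos);

    private:
        std::vector<real> _chi;
};

template<typename real>
std::vector<real> WfnLocalScratch<real>::rho(int nrhos) {
    assert(nrhos >= 0);
    std::vector<real> rhos(nrhos);
    #pragma omp parallel
    {
        // Assumption: each thread only needs scratch space with the same
        // size as _chi, so it is declared inside the parallel region
        // instead of naming the member in a private clause.
        std::vector<real> chi(_chi.size());
        #pragma omp for
        for (int irho = 0; irho < nrhos; irho++) {
            rhos[irho] = static_cast<real>(irho);
        }
    }
    return rhos;
}
```

Note that a variable named in a private clause is default-initialized in each thread, so the per-thread copy of a std::vector would start out empty rather than holding 120 elements. The sketch sizes the local vector explicitly for that reason.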

Update:

In case this helps: the error is triggered when wfn goes out of scope, but my original questions still stand. If main is changed to:

int main(int argc, char** argv) {
    {   
        Wfn<double> wfn;
        std::vector<double> rhovs;

        rhovs = wfn.rho(1000);
        std::cerr << "About to exit scope" << std::endl;
    }   

    std::cerr << "About to return 0" << std::endl;
    return 0;
}

then the error is:

About to exit scope
free(): double free detected in tcache 2
drjrm3
  • I think you have found a bug in g++. No error in clang. I think it is related to `private(_chi)` clause when a private vector is created for each thread, but not properly freed. – Laci Apr 09 '22 at 18:06
  • Yap, I also think this is a bug. If you wrap `_chi` in a helper and print when the constructor and destructor is called, you will see that gcc calls the destructor of `this->_chi` when the function `rho()` returns, for whatever reason. See on [godbolt](https://godbolt.org/z/8h5b9n79x). Seemingly, to trigger the bug one needs a templated class and a member variable needs to be passed via the `private` clause. Curious... – Sedenion Apr 09 '22 at 18:47
  • Please report an issue to the GCC team so they can be aware of the possible bug and be sure this is actually a bug (and not an undefined behaviour). – Jérôme Richard Apr 09 '22 at 21:08
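The instrumentation Sedenion describes (wrapping `_chi` in a helper that reports constructor and destructor calls) could look roughly like the sketch below. It is not the code behind the godbolt link. The names `Tracked` and `WfnTracked` are hypothetical, and the helper deliberately owns no heap memory, so an extra destructor call would show up as an unmatched log line rather than a double free.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for std::vector<real>: logs and counts every
// construction and destruction.
struct Tracked {
    static std::atomic<int> constructed;
    static std::atomic<int> destroyed;

    Tracked() {
        ++constructed;
        std::fprintf(stderr, "ctor %p\n", static_cast<void*>(this));
    }
    Tracked(const Tracked&) {
        ++constructed;
        std::fprintf(stderr, "copy %p\n", static_cast<void*>(this));
    }
    ~Tracked() {
        ++destroyed;
        std::fprintf(stderr, "dtor %p\n", static_cast<void*>(this));
    }
};
std::atomic<int> Tracked::constructed{0};
std::atomic<int> Tracked::destroyed{0};

// Same shape as Wfn in the question: a templated class whose member is
// named in the private clause.
template<typename real>
class WfnTracked {
    public:
        std::vector<real> rho(int nrhos);

    private:
        Tracked _chi;
};

template<typename real>
std::vector<real> WfnTracked<real>::rho(int nrhos) {
    assert(nrhos >= 0);
    std::vector<real> rhos(nrhos);
    #pragma omp parallel for private(_chi)
    for (int irho = 0; irho < nrhos; irho++) {
        rhos[irho] = static_cast<real>(irho);
    }
    return rhos;
}
```

To use it, create a `WfnTracked<double>` inside a scope, call `rho()`, and compare the log lines: every `ctor` or `copy` address should be matched by exactly one `dtor` for the same address. According to Sedenion's comment, g++ with -fopenmp produces an additional destructor call on `this->_chi` when `rho()` returns.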
