Given a data structure based on the map shown below:
std::map<int, std::set<std::vector<int>>> cliques;
Where the key indicates the size of the vectors contained on it.
At the beginning, the map has only one key (e.g. [3]
) which contains the input vectors (e.g. {1, 3, 5}
and {2, 4, 6}
).
My function takes the vectors stored in the biggest key of the map and decompose them into all the possible combinations featuring less elements and store them in the key corresponding to the size of the new vectors (e.g. [2] = {1,3} {1,5} {3,5} {2,4} {2,6} {4,6} and [1] = {1} {3} {5} {2} {4} {6}
) .
I don't know if my solution is the most efficient, but it works pretty well. But since my project is intended to handle a big amount of data, I need to parallelize my code, which led me to the following implementation:
/// Declare data structure
std::map<int, std::set<std::vector<int>>> cliques;
/// insert "input" vectors
cliques[5] = {{1, 3, 5, 7, 9}};
cliques[4] = {{1, 2, 3, 4}};
/// set boundaries
int kMax = 5;
int kMin = 1;
/// enable/disable parallel execution
bool parallelExec = true;
/// "decompose" source vectors:
for (int k = kMax; k > kMin; k--)
{
std::set<std::vector<int>>::iterator it = cliques[k].begin();
#pragma omp parallel num_threads(max_threads) if(parallelExec)
{
for(int s = 0; s < cliques[k].size(); ++s)
{
std::vector<int> clique;
/// maybe should be "omp critical"?
/// maybe "clique" should be private? (or is it already??)
#pragma omp single
{
clique = *it;
}
for (int v = 0; v < clique.size(); ++v)
{
int& vertex = clique[v];
std::vector<int> new_clique;
std::copy_if(clique.begin(), clique.end(), std::back_inserter(new_clique), [vertex](const int& elem) { return elem != vertex; });
int kNew = k - 1;
#pragma omp critical
{
cliques[kNew].insert(new_clique);
}
}
#pragma omp single
{
it++;
}
}
}
}
/// Display results
for(int i = cliques.size(); i > 0 ; i--)
{
auto kSet = cliques[i];
std::cout << "[" << i << "]: ";
for(auto& vec : kSet)
{
std::cout << "{";
for(auto& elem : vec)
{
std::cout << elem << " ";
}
std::cout << "} ";
}
std::cout << std::endl;
}
The use of "omp parallel" and "omp single" (rather than "omp for") allows to access the data structure safely, while allowing all other operations to run in parallel. The code works almost perfectly, almost... since it miss some (very few) sub-vectors in the final results (which are successfully generated if omp is disabled).
Is there any "OMP expert" able to help me with this problem? Thank you in advance.
---------------
UPDATE