I’m searching for a fast way to build a union of multiple vectors in C++.
More specifically: I have a collection of vectors (usually 15-20 vectors with several thousand unsigned integers each; always sorted and unique, so each could also be an std::set). For each stage, I choose some of them (usually 5-10) and build a union vector. Then I save the length of the union vector and choose some other vectors. This is repeated several thousand times. In the end I'm only interested in the length of the shortest union vector.
Small example:
V1: {0, 4, 19, 40}
V2: {2, 4, 8, 9, 19}
V3: {0, 1, 2, 4, 40}
V4: {9, 10}
// The Input Vectors V1, V2 … are always sorted and unique (could also be an std::set)
Choose V1, V3;
Union Vector = {0, 1, 2, 4, 19, 40} -> Size = 6;
Choose V1, V4;
Union Vector = {0, 4, 9, 10, 19, 40} -> Size = 6;
… and so on …
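To make the procedure concrete, here is a minimal sketch of it using plain std::set_union. The names unionSize and shortestUnion, and the idea of passing each stage as a list of indices into the collection, are assumptions made for illustration only; this is a baseline, not a tuned implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <vector>

using Vec = std::vector<std::uint64_t>;

// Hypothetical helper: size of the union of the vectors picked by `chosen`.
// Assumes every input vector is sorted ascending (checked in debug builds).
std::size_t unionSize(const std::vector<Vec>& collection,
                      const std::vector<std::size_t>& chosen) {
    Vec result, tmp;
    for (std::size_t idx : chosen) {
        const Vec& v = collection.at(idx);
        assert(std::is_sorted(v.begin(), v.end()));
        tmp.clear();
        std::set_union(result.begin(), result.end(), v.begin(), v.end(),
                       std::back_inserter(tmp));
        result.swap(tmp);
    }
    return result.size();
}

// Hypothetical helper: smallest union size over all stages.
// Returns the maximum size_t value if `stages` is empty.
std::size_t shortestUnion(const std::vector<Vec>& collection,
                          const std::vector<std::vector<std::size_t>>& stages) {
    std::size_t best = std::numeric_limits<std::size_t>::max();
    for (const auto& chosen : stages) {
        best = std::min(best, unionSize(collection, chosen));
    }
    return best;
}
```

With the four example vectors stored at indices 0 to 3, choosing V1 and V3 corresponds to the index list {0, 2}, and choosing V1 and V4 to {0, 3}.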
At the moment I use std::set_union, but I’m sure there must be a faster way.
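vector<vector<uint64_t>> collection;          // all input vectors, each sorted and unique
vector<uint64_t> chosen;                      // indices into collection for this stage
vector<uint64_t> unionVector, unionVectorTmp; // running union and scratch buffer
for (size_t i = 0; i < chosen.size(); i++) {
    const auto &v = collection.at(chosen.at(i));
    // Merge the next chosen vector into the running union.
    set_union(v.begin(), v.end(),
              unionVector.begin(), unionVector.end(),
              back_inserter(unionVectorTmp));
    unionVector.swap(unionVectorTmp);
    unionVectorTmp.clear();
}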
I'm grateful for every reference.
EDIT 27.04.2017: A new idea:
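unordered_set<unsigned int> unionSet;
unsigned int counter = 0;  // number of distinct values seen so far
// selection holds the vectors chosen for this stage
for (const auto &sel : selection) {
    for (const auto &val : sel) {
        // insert() returns a pair; .second is true only if val was not yet in the set
        auto r = unionSet.insert(val);
        if (r.second) {
            counter++;
        }
    }
}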
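Since only the length of the union matters, another direction might be to count distinct values directly and never store the union. The sketch below walks all chosen vectors in parallel, which relies on each one being sorted and unique. The function name countUnion and the index-based interface are assumptions for illustration. It has not been benchmarked against std::set_union or the unordered_set version, so whether it is actually faster for this data would need measuring.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Vec = std::vector<std::uint64_t>;

// Hypothetical helper: counts the distinct values across the chosen vectors
// without storing them. Assumes each vector is strictly increasing
// (sorted, no duplicates); this is checked in debug builds only.
std::size_t countUnion(const std::vector<Vec>& collection,
                       const std::vector<std::size_t>& chosen) {
    for (std::size_t idx : chosen) {
        const Vec& v = collection.at(idx);
        assert(std::adjacent_find(v.begin(), v.end(),
                                  std::greater_equal<std::uint64_t>()) == v.end());
    }
    // One read position per chosen vector.
    std::vector<std::size_t> pos(chosen.size(), 0);
    std::size_t count = 0;
    for (;;) {
        // Find the smallest value among the current heads.
        bool found = false;
        std::uint64_t smallest = 0;
        for (std::size_t i = 0; i < chosen.size(); ++i) {
            const Vec& v = collection[chosen[i]];
            if (pos[i] < v.size() && (!found || v[pos[i]] < smallest)) {
                smallest = v[pos[i]];
                found = true;
            }
        }
        if (!found) break;  // every chosen vector is exhausted
        ++count;
        // Advance every vector whose head equals that value,
        // so a value shared by several vectors is counted once.
        for (std::size_t i = 0; i < chosen.size(); ++i) {
            const Vec& v = collection[chosen[i]];
            if (pos[i] < v.size() && v[pos[i]] == smallest) ++pos[i];
        }
    }
    return count;
}
```

Each counted value costs one scan over the heads of the chosen vectors, so the work grows with the number of vectors per stage. With 5-10 vectors a linear scan is probably acceptable; for many more, a min-heap over the heads would be the usual replacement.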