I find myself needing to return the size of the intersection of two vectors:
std::vector<int> A_, B_;
I do not require the intersected values, just the size of the set. This function needs to be called a very large number of times. This is part of a much bigger simulation over a (mathematical) graph/network.
My working conditions are:
- Containers are vectors. Changing them would be painful, but I would certainly do so if the gain warrants it.
- The sizes of A_ and B_ have an upper bound of ~100, but are often much smaller.
- Elements of A_ and B_ represent samples taken from {1,2,...,M}, where M > 10,000.
- In general, A_ and B_ have similar, but unequal, sizes.
- Both vectors are unordered.
- The contents of A_ and B_ change, as part of the "bigger simulation".
- Each vector contains only unique elements i.e. no repeats.
My first attempt, using a naive loop, is below, but I think it may not be fast enough. I've assumed that std::set_intersection will be too onerous because of the repeated sorts and allocations it would require.
#include <vector>

// Counts the elements common to A_ and B_ by comparing every pair: O(|A_| * |B_|).
int vec_intersect(const std::vector<int>& A_, const std::vector<int>& B_) {
    int c_count = 0;
    for (std::vector<int>::const_iterator it = A_.begin(); it != A_.end(); ++it) {
        for (std::vector<int>::const_iterator itb = B_.begin(); itb != B_.end(); ++itb) {
            if (*it == *itb) ++c_count;
        }
    }
    return c_count;
}
Given my conditions above, how else can I implement this to gain speed, relatively easily? Should I be thinking about hash tables or going with sorts and STL, or different containers?
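For concreteness, the sort-based route I have been hesitant about might look like the sketch below. It is only a minimal illustration: the name vec_intersect_sorted is hypothetical, it takes the vectors by value so the originals stay unordered, and it relies on the no-repeats condition above. It counts matches with a merge-style walk rather than calling std::set_intersection, so no output container is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical alternative to vec_intersect (the name is made up for this sketch).
// Parameters are taken by value, so each call copies both vectors; the
// caller's vectors are left unordered.
// Assumes each vector contains unique elements, as stated in the conditions.
int vec_intersect_sorted(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    int count = 0;
    std::size_t i = 0, j = 0;
    // Walk both sorted ranges in lockstep, advancing whichever side is smaller.
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;
        } else if (b[j] < a[i]) {
            ++j;
        } else {
            // Equal values: one common element.
            ++count;
            ++i;
            ++j;
        }
    }
    return count;
}
```

The two copies and two sorts on every call are exactly the overhead I am worried about. If the simulation could keep A_ and B_ sorted as they change, only the linear walk would remain, but I have not measured whether that beats the naive loop at these sizes.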