29

I have this function:

vector<string> intersection(const vector<string> &v1, const vector<string> &v2);

I have two vectors of strings and I want to find the strings that are present in both, then fill a third vector with the common elements.

If my vectors are...

v1 = <"a","b","c">
v2 = <"b","c">
Tyler
  • sort() the vectors, and then use a single for loop that walks both vectors simultaneously, always advancing the one whose current element is smaller. Then just collect the elements in common. – tp1 Oct 20 '13 at 22:37
  • Use a `for` loop over one vector and, inside it, another `for` loop over the other vector. – Agent_L Oct 20 '13 at 22:39
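
The sort-and-merge idea from the first comment can be sketched by hand roughly as follows. This is only an illustration: the name `intersect_sorted_merge` is made up here, and the function takes its arguments by value on the assumption that the callers' vectors should stay unsorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name not from the thread). Both vectors are copied
// so that sorting does not disturb the callers' data.
std::vector<std::string> intersect_sorted_merge(std::vector<std::string> a,
                                                std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::vector<std::string> common;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;                      // a's current element is smaller: advance a
        } else if (b[j] < a[i]) {
            ++j;                      // b's current element is smaller: advance b
        } else {
            common.push_back(a[i]);   // equal: element is in both vectors
            ++i;
            ++j;
        }
    }
    return common;
}
```

The nested-loop suggestion in the second comment needs no sorting but compares every pair, so it costs O(n * m) comparisons; the merge above costs O(n log n) for the sorts plus one linear pass.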

3 Answers

60

Try std::set_intersection, for example:

#include <algorithm> // std::sort, std::set_intersection
#include <iostream>  // std::cout
#include <iterator>  // std::back_inserter
#include <string>    // std::string
#include <vector>    // std::vector

// Takes both vectors by value, so sorting does not modify the caller's data.
std::vector<std::string> intersection(std::vector<std::string> v1,
                                      std::vector<std::string> v2){
    std::vector<std::string> v3;

    // std::set_intersection requires both input ranges to be sorted.
    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::set_intersection(v1.begin(),v1.end(),
                          v2.begin(),v2.end(),
                          std::back_inserter(v3));
    return v3;
}

int main(){
    std::vector<std::string> v1 {"a","b","c"};
    std::vector<std::string> v2 {"b","c"};

    auto v3 = intersection(v1, v2);

    for(const std::string& n : v3)
        std::cout << n << ' ';
}
masoud
  • This is O(n log n), where n is the max of the two sizes. Why not just make a hash set consisting of the entries of one of the vectors and then go linearly through the other vector checking for them? That's O(n + m) time, O(m) memory. I can see that the solution I'm proposing is less cache-friendly, in addition to using more memory. – Eric Auld Apr 14 '19 at 19:22
  • I don't think the OP said the vectors are sorted. – laike9m Apr 15 '19 at 20:08
7

You only need to sort the smaller vector. Then make a single pass over the larger vector and use a binary search to test whether each of its items is present in the smaller one.
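
A minimal sketch of that idea, assuming the standard `std::sort` and `std::binary_search` algorithms are acceptable; the function name `intersect_binary_search` is invented for this example:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (name not from the thread). Arguments are taken by
// value so the swap and the sort below do not touch the callers' vectors.
std::vector<std::string> intersect_binary_search(std::vector<std::string> a,
                                                 std::vector<std::string> b) {
    // Make `a` the smaller vector so that only the cheaper sort is paid for.
    if (a.size() > b.size()) a.swap(b);
    std::sort(a.begin(), a.end());

    std::vector<std::string> common;
    for (const std::string& s : b) {
        // O(log |a|) lookup in the sorted smaller vector.
        if (std::binary_search(a.begin(), a.end(), s)) common.push_back(s);
    }
    return common;
}
```

Note that the result comes back in the order of the larger vector, and a string repeated in the larger vector is reported each time it occurs, which differs from what std::set_intersection returns.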

Mikhail Volskiy
3

Instead of sorting, consider trading memory for time by making a hash set out of the smaller vector, and then looping over the larger vector checking for those elements, as suggested here. That takes O(n + m) time on average, compared with O(n log n) for sorting and using std::set_intersection.
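
Something along these lines would do it. Treat it as a sketch: the name `intersect_hashed` is made up, and erasing each match from the set is my own choice so that a string repeated in the larger vector is only reported once.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (name not from the thread).
std::vector<std::string> intersect_hashed(const std::vector<std::string>& v1,
                                          const std::vector<std::string>& v2) {
    // Hash the smaller vector to keep the extra memory down.
    const auto& small = v1.size() <= v2.size() ? v1 : v2;
    const auto& large = v1.size() <= v2.size() ? v2 : v1;

    std::unordered_set<std::string> seen(small.begin(), small.end());

    std::vector<std::string> common;
    for (const std::string& s : large) {
        // erase() returns how many elements it removed (0 or 1 for a set),
        // so each common string is pushed at most once.
        if (seen.erase(s) > 0) common.push_back(s);
    }
    return common;
}
```

The result follows the order of the larger vector rather than sorted order, so sort it afterwards if you need the same output as std::set_intersection.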

Eric Auld