Vector intersection in C++

Question

I have this function

vector<string> intersection(const vector<string> &v1, const vector<string> &v2);

I have two vectors of strings and I want to find the strings that are present in both, then fill a third vector with the common elements.

If my vectors are...

v1 = <"a","b","c">
v2 = <"b","c">

sort() the vectors, and then use a single loop that walks both vectors simultaneously, always advancing whichever one currently points at the smaller element. Then just collect the elements in common. – tp1, Oct 20 '13 at 22:37
`for` loop through one vector and, inside that, `for` loop through the other. – Agent_L, Oct 20 '13 at 22:39
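
The sort-and-walk idea from tp1's comment could look roughly like the sketch below. The function name `intersect_sorted_walk` is made up for illustration, and the parameters are taken by value so the callers' vectors are left unsorted; neither detail comes from the comment itself.

```cpp
#include <algorithm>  // std::sort
#include <cstddef>    // std::size_t
#include <string>
#include <vector>

// Hypothetical helper: sorts private copies, then walks both in lockstep.
std::vector<std::string> intersect_sorted_walk(std::vector<std::string> a,
                                               std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::vector<std::string> common;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;  // a[i] is smaller than everything left in b, so skip it
        } else if (b[j] < a[i]) {
            ++j;  // b[j] is smaller than everything left in a, so skip it
        } else {
            common.push_back(a[i]);  // equal: present in both vectors
            ++i;
            ++j;
        }
    }
    return common;
}
```

The nested-loop variant from the second comment needs no sorting but compares every pair, so it is only reasonable for small inputs.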

masoud · Accepted Answer · 2022-05-17T15:42:41.773

60

Try std::set_intersection, for example:

#include <algorithm> //std::sort, std::set_intersection
#include <iostream> //std::cout
#include <iterator> //std::back_inserter
#include <string> //std::string
#include <vector> //std::vector

std::vector<std::string> intersection(std::vector<std::string> v1,
                                      std::vector<std::string> v2){
    std::vector<std::string> v3;

    std::sort(v1.begin(), v1.end());
    std::sort(v2.begin(), v2.end());

    std::set_intersection(v1.begin(),v1.end(),
                          v2.begin(),v2.end(),
                          std::back_inserter(v3));
    return v3;
}

int main(){
    std::vector<std::string> v1 {"a","b","c"};
    std::vector<std::string> v2 {"b","c"};

    auto v3 = intersection(v1, v2);

    for(const std::string& n : v3)
        std::cout << n << ' ';
}

edited May 17 '22 at 15:42

answered Oct 20 '13 at 22:43

masoud


This is O(n log n), where n is the max of the two sizes. Why not just make a hash set consisting of the entries of one of the vectors and then go linearly through the other vector checking for them? That's O(n + m) time, O(m) memory. I can see that the solution I'm proposing is less cache-friendly, in addition to using more memory. – Eric Auld Apr 14 '19 at 19:22
I think OP didn't say the vectors are sorted. – laike9m Apr 15 '19 at 20:08

score 7 · Answer 2 · answered Feb 03 '16 at 19:32


You only need to sort the smaller vector. Then make a single pass over the bigger vector and use a binary search to test whether each of its items is present in the smaller one.
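
A minimal sketch of that idea, assuming the standard `std::binary_search` is used for the lookup. The name `intersect_binary_search` is hypothetical and only the smaller vector is copied.

```cpp
#include <algorithm>  // std::sort, std::binary_search
#include <string>
#include <vector>

// Hypothetical helper: sorts a copy of the smaller input, scans the larger one.
std::vector<std::string> intersect_binary_search(const std::vector<std::string>& v1,
                                                 const std::vector<std::string>& v2) {
    const bool v1_smaller = v1.size() < v2.size();
    std::vector<std::string> smaller = v1_smaller ? v1 : v2;         // copied, then sorted
    const std::vector<std::string>& larger = v1_smaller ? v2 : v1;   // left untouched

    std::sort(smaller.begin(), smaller.end());

    std::vector<std::string> common;
    for (const std::string& s : larger) {
        // O(log m) lookup in the sorted smaller vector
        if (std::binary_search(smaller.begin(), smaller.end(), s)) {
            common.push_back(s);
        }
    }
    return common;
}
```

Note that the result follows the order of the larger vector rather than sorted order, and a string that appears several times in the larger vector is reported each time.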


Mikhail Volskiy


score 3 · Answer 3 · answered Apr 14 '19 at 19:32

Instead of sorting, consider trading memory for time: build a hash set from the smaller vector, then loop over the larger vector checking for those elements, as suggested here. For sizes n and m that takes O(n + m) time on average, versus O(n log n) for sorting and using std::set_intersection.
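
One way this could be written, as a sketch only: the name `intersect_hashed` is made up, and erasing each match from the set (so every common string is reported once) is my own choice rather than part of the suggestion.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: hashes the smaller input, scans the larger one.
std::vector<std::string> intersect_hashed(const std::vector<std::string>& v1,
                                          const std::vector<std::string>& v2) {
    const bool v1_smaller = v1.size() < v2.size();
    const std::vector<std::string>& smaller = v1_smaller ? v1 : v2;
    const std::vector<std::string>& larger = v1_smaller ? v2 : v1;

    // Extra memory proportional to the smaller vector.
    std::unordered_set<std::string> lookup(smaller.begin(), smaller.end());

    std::vector<std::string> common;
    for (const std::string& s : larger) {
        // erase() returns the number of elements removed (0 or 1 here),
        // so a string is only collected the first time it is seen.
        if (lookup.erase(s) > 0) {
            common.push_back(s);
        }
    }
    return common;
}
```

The output order follows the larger vector. Whether this actually beats the sorted version depends on the input sizes and string lengths, so it is worth measuring on real data.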
