
I have a function that takes an unordered set as a parameter. Since I am using OpenMP, I convert this unordered set to a vector, using `std::copy` for the conversion.

// pseudocode made concrete; T is the element type
void func(const std::unordered_set<T>& s1)
{
    std::vector<T> v1;
    std::copy(s1.begin(), s1.end(), std::back_inserter(v1));

#pragma omp parallel for
    for (std::size_t i = 0; i < v1.size(); i++)
    {
        // accessing v1[i]
    }
}

However, I feel `std::copy` is a costly operation. So what I think is: if I create a class member vector and keep populating it as and when I update my set, I can avoid the `std::copy` operation entirely, since the time complexity of a vector's `push_back` is amortized O(1). What do you suggest?
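For concreteness, something like this is what I have in mind (a rough sketch; the wrapper name `SetMirror` and element type `int` are placeholders, and erasing from the set is not handled):

#include <unordered_set>
#include <vector>

class SetMirror {
public:
    void insert(int value) {
        // push_back only when the set did not already contain the value,
        // so the vector always mirrors the set's contents
        if (s_.insert(value).second)
            v_.push_back(value);  // amortized O(1)
    }

    const std::vector<int>& as_vector() const { return v_; }

private:
    std::unordered_set<int> s_;
    std::vector<int> v_;
};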

– sameer karjatkar
  • Also note you can directly construct your vector as `vector v1(s1.begin(), s1.end());` as there is an overload of the constructor that takes two iterators – Cory Kramer Dec 29 '15 at 12:46
  • @CoryKramer But it will again involve a member-by-member copy. I mean, will it be faster than `std::copy`? – sameer karjatkar Dec 29 '15 at 12:48
  • If you don't need to use the vector until after the `unordered_set` is filled, I think populating them in parallel would not be useful. It is best to correctly `reserve` the vector before filling it. Your method doesn't do that, but could easily be changed to. I'm pretty sure the version suggested by CoryKramer **does** correctly reserve before filling. – JSF Dec 29 '15 at 12:50
  • Can you justify your claim that `std::copy()` is expensive, other than having a feeling? `push_back()` may be O(1), but adding `n` elements (say, in a loop) is O(n), so your reasoning as described here is flawed. – Peter Dec 29 '15 at 13:00
  • @sameerkarjatkar: Getting the data from a `std::unordered_set` into a `std::vector` is always going to be an O(n) operation. You can improve it by using `reserve`. Note that `std::copy` with a `back_insert_iterator` is just a `push_back` loop. It's no more expensive than writing the loop yourself. – Blastfurnace Dec 29 '15 at 13:00
  • The relative benefits of filling the two containers in parallel are avoiding cache misses on the reread of stale data from the `unordered_set` and the overhead of its iterator. Probably trivial, but we can't be sure. The benefit of reserving the vector, once you know the correct size and before you start filling it, is ballpark 3X on the vector operations alone, likely 2X on the whole copy operation (which includes non-vector work). That is more likely to be the non-trivial performance difference. – JSF Dec 29 '15 at 15:04
  • Yep, creating the vector with the two iterators is probably your best bet here. Not to mention it looks cleaner. Iterating over an `unordered_set` is ultimately going to be the bottleneck, not the copy operation. At the very least, if the vector size is known in advance you're not going to be adding a ton of overhead. Also, any cache benefits here are likely marginal. – alfalfasprout Dec 29 '15 at 16:21
  • Is the copy of each element the costly operation, or merely the copy of the set itself due to the large number of elements? If you are in the first situation (costly element copy), you could consider using a vector of pointers and filling it with pointers to the set elements: `std::vector<const T*> v1(s1.size()); std::transform(s1.begin(), s1.end(), v1.begin(), [](const T& cur) { return &cur; });` – Javier Martín Dec 30 '15 at 17:25

2 Answers


`std::back_insert_iterator` calls `std::vector::push_back`, so your proposal doesn't improve anything.
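Roughly, the two forms below do the same per-element work (`to_vector` is just an illustrative helper, not a standard function):

#include <algorithm>
#include <iterator>
#include <unordered_set>
#include <vector>

// Illustrative helper: both branches perform one push_back per element.
template <typename T>
std::vector<T> to_vector(const std::unordered_set<T>& s1, bool use_copy) {
    std::vector<T> v1;
    if (use_copy) {
        // back_insert_iterator's operator= forwards to v1.push_back
        std::copy(s1.begin(), s1.end(), std::back_inserter(v1));
    } else {
        for (const T& x : s1)
            v1.push_back(x);  // same cost, written out by hand
    }
    return v1;
}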

What is important is that you know the size `v1` will have beforehand, so make use of that information and have `std::vector` allocate its storage only once, avoiding the reallocations that `push_back` performs whenever `v1.size() == v1.capacity()`.

Do this:

std::vector<T> v1;
v1.reserve(s1.size());
std::copy(s1.begin(), s1.end(), std::back_inserter(v1));

or this:

std::vector<T> v1(s1.size());
std::copy(s1.begin(), s1.end(), v1.begin());

or, as suggested by @CoryKramer, idiomatically constructing `v1` from a range:

std::vector<T> v1(s1.begin(), s1.end());

Update:

All three versions perform `s1.size()` copies of `T`. However, when measured on GCC with 10^7 elements and `T = int`, the `std::vector::reserve` approach turned out to be the fastest (twice as fast as range construction, because `std::distance` over forward iterators has linear complexity, whereas `std::unordered_set::size` is constant). This difference will diminish when dealing with fewer and larger objects, but it will still exist.

The second way was only slightly slower than the first, because it value-initializes the elements before overwriting them.

Conclusion: use `std::vector::reserve`.
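For reference, a timing harness along these lines can reproduce the comparison. This is only a sketch of how such a measurement might be set up (compile as C++14), not the exact benchmark used:

#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <iterator>
#include <unordered_set>
#include <vector>

template <typename F>
static void time_it(const char* name, F body) {
    auto t0 = std::chrono::steady_clock::now();
    std::size_t n = body();  // use the result so the work isn't optimized away
    auto t1 = std::chrono::steady_clock::now();
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    std::printf("%s: %lld ms (n = %zu)\n", name, (long long)ms, n);
}

int main() {
    std::unordered_set<int> s1;
    for (int i = 0; i < 10000000; ++i)  // 10^7 elements, T = int
        s1.insert(i);

    time_it("reserve + copy", [&] {
        std::vector<int> v1;
        v1.reserve(s1.size());
        std::copy(s1.begin(), s1.end(), std::back_inserter(v1));
        return v1.size();
    });

    time_it("value-init + copy", [&] {
        std::vector<int> v1(s1.size());
        std::copy(s1.begin(), s1.end(), v1.begin());
        return v1.size();
    });

    time_it("range constructor", [&] {
        std::vector<int> v1(s1.begin(), s1.end());
        return v1.size();
    });
}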

– LogicStuff

You could boost performance a bit using this:

void func(const std::unordered_set<T>& s1)
{
    std::vector<T> v1;
    v1.reserve(s1.size()); // HERE: allocate the storage once, up front
    std::copy(s1.begin(), s1.end(), std::back_inserter(v1));

#pragma omp parallel for
    for (std::size_t i = 0; i < v1.size(); i++)
    {
        // accessing v1[i]
    }
}

However, the cost of copying each object is a problem you still have to deal with.
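If those element copies are the expensive part, one way to sidestep them, along the lines of Javier Martín's comment above, is a vector of pointers into the set. A sketch, with `make_view` as a made-up helper name:

#include <algorithm>
#include <unordered_set>
#include <vector>

// Build a vector of pointers to the set's elements instead of copies.
// Elements of an unordered_set are const, hence const T*. The pointers
// stay valid until the pointed-to elements are erased from the set.
template <typename T>
std::vector<const T*> make_view(const std::unordered_set<T>& s1) {
    std::vector<const T*> v1(s1.size());
    std::transform(s1.begin(), s1.end(), v1.begin(),
                   [](const T& cur) { return &cur; });
    return v1;
}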

– Humam Helfawi