1

I have a routine in which I define a bunch of objects (around 20), call them Jet, which define operator< so that I can sort them. After they are sorted, I take the lowest two. What is a fast way to do this? The options I've thought of so far:

  1. boost::ptr_vector<Jet>: use the built-in .sort(), take the first two.
  2. boost::ptr_list<Jet>: use .sort(), take the first two.
  3. Use a list as above, but rather than sorting, use max_element, remove the element, and run again.

I assume using std::vector<Jet> would be the worst option because I don't need random access, the sort will move objects around in memory, and the objects will be copied when calling push_back(Jet). Because of the copying required, I'd also assume that a std::list<Jet> would be worse than boost::ptr_list<Jet>. I'd further assume that taking max_element twice would be faster than sorting the whole list.

Is my logic sound? Is the performance difference going to be significant? Is there some other option I'm not thinking of?

Shep
  • Why don't you test it and tell us? (Also, your assumptions are probably wrong, be sure to test those) – Benjamin Lindley Apr 26 '12 at 21:23
  • valid question: because I'm not very experienced at profiling and I thought someone here might be smarter about these things than I am. – Shep Apr 26 '12 at 21:26

3 Answers

3

One of your assumptions is right, in that storing pointers instead of objects would be faster.

If you only need the two smallest elements, however, there's no need to sort anything. Start with the first two elements as your candidates, then iterate through the rest of the vector or list, replacing a candidate whenever you find a smaller element.
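The single pass described above might look like the following sketch. The `Jet` struct with its `pt` member and the helper name `two_smallest` are assumptions made for illustration; the only thing taken from the question is that `Jet` has an `operator<`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the asker's class; only operator< is assumed.
struct Jet {
    double pt;
};
bool operator<(const Jet& a, const Jet& b) { return a.pt < b.pt; }

// Returns pointers to the smallest and second-smallest elements.
// Precondition: jets.size() >= 2. One pass; no element is moved or copied.
std::pair<const Jet*, const Jet*> two_smallest(const std::vector<Jet>& jets) {
    assert(jets.size() >= 2);
    const Jet* lo = &jets[0];
    const Jet* next = &jets[1];
    if (*next < *lo) std::swap(lo, next);
    for (std::size_t i = 2; i < jets.size(); ++i) {
        const Jet* cur = &jets[i];
        if (*cur < *lo) {
            next = lo;   // old minimum becomes the runner-up
            lo = cur;
        } else if (*cur < *next) {
            next = cur;
        }
    }
    return std::make_pair(lo, next);
}
```

Each element after the first two costs at most two comparisons, and the container is left untouched, so the same loop would work over a list with iterators in place of indices.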

Luchian Grigore
2

There is a standard library algorithm to do exactly this:

#include <algorithm>  // std::partial_sort
#include <vector>

std::vector<Jet> v;
// populate v (it must hold at least two elements)
std::partial_sort(v.begin(), v.begin() + 2, v.end());

v[0] is now the smallest element and v[1] is now the second smallest element; remaining elements are in an unspecified order. std::vector<> can be substituted with std::deque<>, as the latter also has random-access iterators.

(Note that the complexity mentioned on the page I linked to is incorrect; C++11 §25.4.1.3 says the complexity is (last - first) * log(middle - first).)

ildjarn
  • I like this answer but O(n log(2)) is more complex than O(n). – Chad Apr 26 '12 at 23:47
  • @Chad : Any O(N log(K)) algorithm is going to be faster than O(N) when K<10. When N==20 as the OP stated, N > N*log(2) (which happens to be ~6). – ildjarn Apr 26 '12 at 23:50
  • @ildjarn (and Chad) Actually, O(N) and O(N log(2)) are identical, because log(2) is a constant. Of course when we're dealing with a problem where 20 is considered a large value for N, algorithmic complexity isn't all that relevant. If performance matters at all, profiling on the target machine is the only thing that's going to give you a relevant answer. So, I'd start with std::partial_sort (because it's simple, well-defined, and already written and tested), and see whether it's fast enough. – abarnert Apr 27 '12 at 00:48
  • @abarnert : Very valid point regarding big-O notation. I just meant that given a concrete N, obviously smaller is better. :-] – ildjarn Apr 27 '12 at 00:51
  • You're misleadingly discrediting other answers (including mine). I agree that this is a better solution, but saying it has better complexity than the others is wrong. – Luchian Grigore Apr 27 '12 at 06:22
  • This is cleaner, which is why I chose it. On the other hand, I was looking for a solution that sorted pointers rather than the objects themselves (which in itself may not be that important, they are only ~80 bytes each). But I get it: there's no clear performance winner without looking at the implementation, the hardware, etc. – Shep Apr 27 '12 at 06:28
  • @Shep : If your container stores pointers rather than objects, there is an overload of `std::partial_sort` that takes a comparison predicate -- pass a predicate that does the appropriate dereferencing and comparison. That being said, if your objects are properly movable, storing pointers is not going to buy you much, if anything. – ildjarn Apr 27 '12 at 16:59
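For the pointer case raised in the last comment, a minimal sketch of the predicate overload could look like this. The `Jet` struct, the `JetPtrLess` functor, and the `two_smallest_first` helper are hypothetical names introduced for illustration, and a plain `std::vector<Jet*>` stands in for whatever pointer container is actually used.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the asker's class; only operator< is assumed.
struct Jet {
    double pt;
};
bool operator<(const Jet& a, const Jet& b) { return a.pt < b.pt; }

// Predicate that compares the pointed-to objects, not the addresses.
struct JetPtrLess {
    bool operator()(const Jet* a, const Jet* b) const { return *a < *b; }
};

// Reorders the pointers so the first two point at the two smallest Jets.
// Precondition: ptrs.size() >= 2. The Jet objects themselves are not moved.
void two_smallest_first(std::vector<Jet*>& ptrs) {
    assert(ptrs.size() >= 2);
    std::partial_sort(ptrs.begin(), ptrs.begin() + 2, ptrs.end(), JetPtrLess());
}
```

Only the pointers are shuffled here, so the cost of each swap is independent of the size of `Jet`. Whether that is measurably faster than sorting ~80-byte objects directly is something only profiling can settle.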
1

If you only need the lowest two objects, you can write your own search function that runs in O(N) time. A full sort takes O(N log N) time.

Daniel