6

I've written some K-nearest-neighbor query methods which build a list of points that are nearest to a given query point. To maintain that list of neighbors, I use the std::priority_queue such that the top element is the farthest neighbor to the query point. This way I know if I should push the new element that is currently being examined (if at a lesser distance than the current farthest neighbor) and can pop() the farthest element when my priority-queue has more than K elements.

So far, all is well. However, when I output the elements, I would like to order them from the closest to the farthest. Currently, I simply pop all the elements from the priority-queue and put them on the output-container (through an iterator), which results in a sequence of points ordered from farthest to closest, so then, I call std::reverse on the output iterator range.

As a simple example, here is a linear-search that uses the priority-queue (obviously, the actual nearest-neighbor query methods I use are far more complicated):

  template <typename DistanceValue,
            typename ForwardIterator,
            typename OutputIterator,
            typename GetDistanceFunction,
            typename CompareFunction>
  inline 
  OutputIterator min_dist_linear_search(ForwardIterator first,
                                        ForwardIterator last,
                                        OutputIterator output_first,
                                        GetDistanceFunction distance,
                                        CompareFunction compare,
                                        std::size_t max_neighbors = 1,
                                        DistanceValue radius = std::numeric_limits<DistanceValue>::infinity()) {
    if(first == last) 
      return output_first;

    typedef std::priority_queue< std::pair<DistanceValue, ForwardIterator>, 
                                 std::vector< std::pair<DistanceValue, ForwardIterator> >,
                                 detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction> > PriorityQueue; 

    PriorityQueue output_queue = PriorityQueue(detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction>(compare));

    for(; first != last; ++first) {
      DistanceValue d = distance(*first);
      if(!compare(d, radius)) 
        continue;

      output_queue.push(std::pair<DistanceValue, ForwardIterator>(d, first));

      while(output_queue.size() > max_neighbors)
        output_queue.pop();

      if(output_queue.size() == max_neighbors)
        radius = output_queue.top().first;
    };

    OutputIterator it = output_first;
    while( !output_queue.empty() ) {
      *it = *(output_queue.top().second);
      output_queue.pop(); ++it;
    };
    std::reverse(output_first, it);
    return it;
  };

The above is all dandy except for one thing: it requires the output-iterator type to be bidirectional and essentially be pointing to a pre-allocated container. Now, this practice of storing the output in a range prescribed by some output iterator is great and pretty standard too (e.g. std::copy and other STL algorithms are good examples of that). However, in this case I would like to be able to only require a forward output-iterator type, which would make it possible to use back-inserter iterators like those provided for STL containers and iostreams.

So, this boils down to reversing the priority-queue before dumping its content in the output iterator. So, these are the better options I've been able to come up with:

  • Create a std::vector, dump the priority-queue content in it, and dump the elements into the output-iterator using a reverse-iterator on the vector.

  • Replace the std::priority_queue with a sorted container (e.g. std::multimap), and then dump the content into the output-iterator using the appropriate traversal order.

Are there any other reasonable option?

I used to employ a std::multimap in a previous implementation of this algorithm and others, as of my second option above. However, when I switched to std::priority_queue, the performance gain was significant. So, I'd rather not use the second option, as it really seems that using a priority-queue for maintaining the list of neighbors is much better than relying on a sorted array. Btw, I also tried a std::vector that I maintain sorted with std::inplace_merge, which was better than multimap, but didn't match up to the priority-queue.

As for the first option, which is my best option at this point, it just seems wasteful to me to have to do this double transfer of data (queue -> vector -> output). I'm just inclined to think that there must be a simpler way to do this... something that I'm missing..

The first option really isn't that bad in this application (considering the complexity of the algorithm that precedes it), but if there is a trick to avoid this double memory transfer, I'd like to know about it.

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
Mikael Persson
  • 18,174
  • 6
  • 36
  • 52

3 Answers3

5

Problem solved!

I'm such an idiot... I knew I was missing something obvious. In this case, the std::sort_heap() function. The reference page even has an example that does exactly what I need (and since the std::priority_queue is just implemented in terms of a random-access container and the heap-functions (pop_heap, push_heap, make_heap) it makes no real difference to use these functions directly in-place of the std::priority_queue class). I don't know how I could have missed that.

Anyways, I hope this helps anyone who had the same problem.

Mikael Persson
  • 18,174
  • 6
  • 36
  • 52
3

why don't you just specify the opposite comparison function in the declaration:

#include <iostream>
#include <queue>
#include <vector>
#include <functional>

int main() {
    std::priority_queue<int, std::vector<int>, std::greater<int> > pq;
    pq.push(1);
    pq.push(10);
    pq.push(15);
    std::cout << pq.top() << std::endl;
}
moooeeeep
  • 31,622
  • 22
  • 98
  • 187
3

One dirty idea, which would nevertheless be guaranteed to work, would be the following:

std::priority_queue<int, std::vector<int>, std::less<int> > queue;
queue.push(3);
queue.push(5);
queue.push(9);
queue.push(2);

// Prints in reverse order.
int* front = const_cast<int*>(&queue.top());
int* back = const_cast<int*>(front + queue.size());
std::sort(front, back);
while (front < back) {
    printf("%i ", *front);
    ++front;
}

It may be noted that the in-place sorting will likely break the queue.

Johan Kotlinski
  • 25,185
  • 9
  • 78
  • 101
  • Dirty indeed.. After checking the standard documentation, it appears that the `std::priority_queue` is indeed guaranteed to be compactly stored (no holes in it, like many b-tree implementations would produce). And, using a `std::vector` of course, storage is contiguous. So, I guess you're right, this "dirty" trick will be guaranteed to work. And I don't care about breaking the queue, I don't need it after that. – Mikael Persson Feb 25 '12 at 00:41