7

Suppose we have a vector of pairs:

std::vector<std::pair<A,B>> v;

where for type A only equality is defined:

bool operator==(A const & lhs, A const & rhs) { ... }

How would you sort it so that all pairs with the same first element end up adjacent? To be clear, the output I hope to achieve should be the same as what something like this produces:

std::unordered_multimap<A,B> m(v.begin(),v.end());
std::copy(m.begin(),m.end(),v.begin());

However I would like, if possible, to:

  • Do the sorting in place.
  • Avoid the need to define a hash function consistent with equality.

Edit: additional concrete information.

In my case the number of elements isn't particularly big (I expect N = 10~1000), though I have to repeat this sorting many times (~400) as part of a bigger algorithm. The data type A is pretty big: it contains, among other things, an unordered_map with ~20 std::pair<uint32_t,uint32_t> in it, which is the structure that prevents me from inventing an ordering and makes it hard to build a hash function.

pqnet
  • `unordered_multimap` pays with space for speedup. Your other solution is the obvious O(n^2) matcher. – Sergey Kalinichenko Aug 15 '14 at 12:07
  • @dasblinkenlight which other solution? I only wrote the one using multimap – pqnet Aug 15 '14 at 12:07
  • The obvious solution with two nested `for` loops. I assume that it's too slow to be mentioned. – Sergey Kalinichenko Aug 15 '14 at 12:09
  • The [`std::sort`](http://en.cppreference.com/w/cpp/algorithm/sort) function expects an `operator<` for the sorting to work. Fortunately there is an overload that allows you to pass a custom comparator function, which should behave as an `operator<`. Or just make your own `operator<` global function. – Some programmer dude Aug 15 '14 at 12:09
  • @JoachimPileborg but the `A` data type does not have such an operator, and none can be defined naturally (i.e., there is no easy way to order them except by using a hash). – pqnet Aug 15 '14 at 12:11
  • @dasblinkenlight oh well I didn't think about that. It may be indeed the fastest one though. – pqnet Aug 15 '14 at 12:13
  • @pqnet You already have the fastest one - `unordered_multimap` is O(n). – Sergey Kalinichenko Aug 15 '14 at 12:15
  • @dasblinkenlight fastest asymptotically means faster when I have an infinite number of items. With finite number of items it may not be faster if it is more complex, in particular hashing may be not so fast in my case – pqnet Aug 15 '14 at 12:31
  • "Sort" only makes sense when different elements can be ordered in some way. What you are looking for is grouping or clustering. Why the in-place requirement? (The only real reason to do this in place is that you are low on memory.) – IdeaHat Aug 15 '14 at 12:33
  • @pqnet "fastest asymptotically means faster when I have an infinite number of items" You cannot have an infinite number of items :) All it means is that there exists a finite number of items `N` large enough to make the asymptotically faster algorithm run faster than the asymptotically slower one. I do not know if the number of items that you have is large enough to make unordered_multimap worth it. All I know is that the number of items that you have is large enough to make you want to do the bucketing in place :) – Sergey Kalinichenko Aug 15 '14 at 12:42

4 Answers

3

If you can come up with a function that assigns a unique number to each unique element, then you can build a secondary array of these numbers, sort it, and permute the primary array along with it, for example with a merge sort.

But in this case you need a function that assigns a unique number to each unique element, i.e. a hash function without collisions. I think this should not be a problem.

As for the asymptotics: if the hash function is O(1), then building the secondary array is O(N) and sorting it together with the primary array is O(N log N), for a total of O(N + N log N) = O(N log N). The downside of this solution is that it requires double the memory.

In conclusion, the main point of this solution is to quickly translate your elements into values that can be compared quickly.
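As a rough sketch of this idea (not code from the answer): the function below precomputes one number per element, sorts an index array by those numbers, and then applies the permutation. The names `group_by_number` and `to_number` are made up for illustration, `to_number` is assumed to be collision-free as described above, and `std::sort` stands in for the merge sort mentioned in the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Groups equal keys together by sorting on a precomputed numeric key.
// `to_number` is a hypothetical caller-supplied function that must map
// equal A values to the same number and unequal A values to different ones.
template <class A, class B, class ToNumber>
void group_by_number(std::vector<std::pair<A, B>>& v, ToNumber to_number)
{
    // Secondary array: one number per element, computed once (O(N)).
    std::vector<std::uint64_t> numbers;
    numbers.reserve(v.size());
    for (auto const& p : v)
        numbers.push_back(to_number(p.first));

    // Sort indices by number instead of shuffling the large elements around.
    std::vector<std::size_t> order(v.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&](std::size_t l, std::size_t r) { return numbers[l] < numbers[r]; });

    // Apply the permutation; each element is moved exactly once.
    std::vector<std::pair<A, B>> grouped;
    grouped.reserve(v.size());
    for (std::size_t i : order)
        grouped.push_back(std::move(v[i]));
    v = std::move(grouped);
}
```

This version is not in place: it uses a temporary vector of the same size, which is the memory cost the answer refers to.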

MrPisarik
  • You can do what you suggest in place; you just can't cache the hash values. – IdeaHat Aug 15 '14 at 13:45
  • @pqnet BTW, if sizeof(A) <= 8, then I'd just do a reinterpret cast on the data (with appropriate shifting) and do it MrPisarik's way. – IdeaHat Aug 15 '14 at 13:48
  • In practice, 4 bytes is often much less than the size of the structure, so allocating an array of ints is not a problem :) But if there is not enough memory, then you have to trade away speed, as always :) – MrPisarik Aug 15 '14 at 14:02
  • not necessarily, it depends on the complexity of the hash function. A quick hash function calculation can easily take less or equivalent time than a non-cached memory hit (this is under the assumption that in most instances you'll have to read the key anyway to move it). Again, this is one of those things you have to measure. – IdeaHat Aug 15 '14 at 14:08
  • @MadScienceDreams I can't comment on your answer because I have fewer than 50 points. But I don't understand why the condition in the inner loop is 'j < n && i < n-1' when the outer loop has 'i < n-2', i.e. 'i < n-1' always holds. – MrPisarik Aug 15 '14 at 14:09
  • @MadScienceDreams oh. Finally I understand what you're saying:)) You are right this way can fix the problem with memory and even improve speed all depends on hash-function:) – MrPisarik Aug 15 '14 at 14:21
  • @MrPisarik You can advance i++ in the bottom loop, so I threw it in there without thinking too hard, though I think you may be right that it cannot possibly get to that point (since we enforce i < j < n at the beginning of the loop). – IdeaHat Aug 15 '14 at 15:05
  • @MrPisarik If I could write a (fast) hash function I would be happy to use the original `unordered_multimap` approach I thought of (which in terms of raw complexity is an unbeatable O(N)). @MadScienceDreams wouldn't the hash calculation require fetching the object(s) from memory anyway? – pqnet Aug 16 '14 at 06:28
3

First option: cluster() and sort_within()

The handwritten double loop by @MadScienceDreams can be written as a cluster() algorithm of O(N * K) complexity, with N elements and K clusters. It repeatedly calls std::partition (using C++14 style with generic lambdas, easily adaptable to C++11, or even C++98 style by writing your own function objects):

template<class FwdIt, class Equal = std::equal_to<>>
void cluster(FwdIt first, FwdIt last, Equal eq = Equal{}) 
{
    for (auto it = first; it != last; /* increment inside loop */)
        it = std::partition(it, last, [=](auto const& elem){
            return eq(elem, *it);    
        });    
}

which you call on your input vector<std::pair> as

cluster(begin(v), end(v), [](auto const& L, auto const& R){
    return L.first == R.first;
});

The next algorithm to write is sort_within which takes two predicates: an equality and a comparison function object, and repeatedly calls std::find_if_not to find the end of the current range, followed by std::sort to sort within that range:

template<class RndIt, class Equal = std::equal_to<>, class Compare = std::less<>>
void sort_within(RndIt first, RndIt last, Equal eq = Equal{}, Compare cmp = Compare{})
{
    for (auto it = first; it != last; /* increment inside loop */) {
        auto next = std::find_if_not(it, last, [=](auto const& elem){
            return eq(elem, *it);
        });
        std::sort(it, next, cmp);
        it = next;
    }
}

On an already clustered input, you can call it as:

sort_within(begin(v), end(v), 
    [](auto const& L, auto const& R){ return L.first == R.first; },
    [](auto const& L, auto const& R){ return L.second < R.second; }
);

Live Example that shows it for some real data using std::pair<int, int>.

Second option: user-defined comparison

Even if there is no operator< defined on A, you might define it yourself. Here, there are two broad options. First, if A is hashable, you can define

bool operator<(A const& L, A const& R)
{
    return std::hash<A>()(L) < std::hash<A>()(R);
}

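and write std::sort(begin(v), end(v)) directly. This makes O(N log N) calls to std::hash unless you cache the hash values in separate storage. Note that this clusters equal keys reliably only if unequal keys never share a hash value: keys whose hashes collide compare as equivalent and can end up interleaved.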
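Hash collisions can be tolerated if the sort is followed by a clean-up pass. The sketch below is an illustration, not part of the answer: it sorts by hash value and then, inside each run of equal hash values, separates unequal keys with std::partition. The name `group_by_hash` and the `hash` parameter are assumptions, and the elements are assumed to be pairs whose `first` member supports `operator==`.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// Sorts [first, last) by hash(elem.first), then splits every run of equal
// hash values into clusters of truly equal keys.
template <class RndIt, class Hash>
void group_by_hash(RndIt first, RndIt last, Hash hash)
{
    // Step 1: order by hash value; equal keys now share a run of equal hashes.
    std::sort(first, last, [&](auto const& l, auto const& r) {
        return hash(l.first) < hash(r.first);
    });

    while (first != last) {
        // Step 2: find the end of the run whose hash equals that of *first.
        auto const h = hash(first->first);
        auto const run_end = std::find_if(first, last, [&](auto const& e) {
            return hash(e.first) != h;
        });

        // Step 3: inside the run, separate keys that collide but are not equal.
        // The pivot stays outside the partitioned range, so it is never moved.
        while (first != run_end) {
            auto const pivot = first;
            first = std::partition(std::next(pivot), run_end, [&](auto const& e) {
                return e.first == pivot->first;
            });
        }
    }
}
```

With a good hash the runs are short, so the clean-up pass costs little; with a poor hash it degrades towards the O(N * K) behaviour of cluster(). The hash is recomputed on every comparison here, so caching it may be worthwhile when hashing is expensive.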

Second, if A is not hashable but does have getters x(), y() and z() for the data members that together determine equality on A, you can do

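bool operator<(A const& L, A const& R)
{
    // std::tie binds lvalues, so the getters must return references;
    // use std::make_tuple instead if they return by value
    return std::tie(L.x(), L.y(), L.z()) < std::tie(R.x(), R.y(), R.z());
}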

Again you can write std::sort(begin(v), end(v)) directly.

TemplateRex
  • Nice and complete answer. In my case the problem is that one of the members of `A` is not easily comparable (it is an `unordered_map` itself; they have `operator ==`, but I'm not sure if I can build a hash of them efficiently too) – pqnet Aug 16 '14 at 06:36
  • @pqnet see [this answer](http://stackoverflow.com/a/21688822/819272) for how to order unordered containers using `lexicographical_compare` – TemplateRex Aug 16 '14 at 07:44
  • Interesting, though the fact that a lexicographical `operator <` is defined for `map` but not for `unordered_map` makes me think that there is some reason for that – pqnet Aug 16 '14 at 07:47
  • @pqnet the reason is that the ordering based on hash keys is implementation-defined. For your purposes, that shouldn't matter, but if it does you could use the `cluster` method with `stable_partition` instead – TemplateRex Aug 16 '14 at 07:51
  • the order really doesn't matter, for my purposes it is enough that same key items are close in the final ordering. – pqnet Aug 16 '14 at 08:00
  • @pqnet then hash-based ordering is fine – TemplateRex Aug 16 '14 at 08:01
2

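An in-place algorithm is:

// n is the number of elements as a signed int: n = static_cast<int>(v.size())
for (int i = 0; i < n-2; i++)
{
   // start at i+1 so that an element already adjacent to v[i] is counted too
   for (int j = i+1; j < n; j++)
   {
      if (v[j].first == v[i].first)
      {
         std::swap(v[j],v[i+1]);
         i++;
      }
   }
}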

There is probably a more elegant way to write the loop, but this is O(n*m), where n is the number of elements and m is the number of distinct keys. So if m is much smaller than n (the best case being that all the keys are the same), this is approximately O(n). In the worst case the number of keys is close to n, so this is O(n^2). I have no idea how many keys you expect, so I can't really work out the average case, but it is most likely O(n^2) as well.

For a small number of keys, this may work faster than unordered multimap, but you'll have to measure to find out.

Note: the order of the clusters is arbitrary.

Edit: (much more efficient in the partially-clustered case, doesn't change complexity)

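// n is the number of elements as a signed int: n = static_cast<int>(v.size())
for (int i = 0; i < n-2; i++)
{
   // skip over elements that are already next to their cluster
   for(;i<n-2 && v[i+1].first==v[i].first; i++){}

   for (int j = i+2; j < n; j++)
   {
      if (v[j].first == v[i].first)
      {
         std::swap(v[j],v[i+1]);
         i++;
      }
   }
}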

Edit 2: Following @MrPisarik's comment, removed the redundant i check in the inner loop.
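For reference, here is a self-contained variant of the same double loop, written as a sketch rather than as the exact code above. It keeps an explicit `end` index for the cluster being built instead of advancing `i` inside the inner loop, and uses std::size_t so that very small vectors need no special handling. The function name `cluster_in_place` is made up for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Moves all pairs with an equal first element next to each other, in place.
// Only operator== on A is required. Cost is O(n * m) for m distinct keys.
template <class A, class B>
void cluster_in_place(std::vector<std::pair<A, B>>& v)
{
    std::size_t const n = v.size();
    std::size_t i = 0;                    // start of the cluster being built
    while (i < n) {
        std::size_t end = i + 1;          // [i, end) holds elements equal to v[i]
        for (std::size_t j = end; j < n; ++j) {
            if (v[j].first == v[i].first) {
                if (j != end)
                    std::swap(v[j], v[end]);  // pull the match into the cluster
                ++end;
            }
        }
        i = end;                          // the next cluster starts right after
    }
}
```

Whether this beats the unordered_multimap approach for N in the range of 10 to 1000 depends on the cost of operator== and on the number of distinct keys, so it still has to be measured.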

IdeaHat
  • I would have the inner loop run backwards. In this way you limit the number of swaps to O(N) instead of O(N^2) – pqnet Aug 16 '14 at 06:18
2

I'm surprised no one has suggested the use of std::partition yet. It makes the solution nice, elegant, and generic:

template<typename BidirIt, typename BinaryPredicate>
void equivalence_partition(BidirIt first, BidirIt last, BinaryPredicate p) {
  using element_type = typename std::decay<decltype(*first)>::type;

  if(first == last) {
    return;
  }

  auto new_first = std::partition
    (first, last, [=](element_type const &rhs) { return p(*first, rhs); });

  equivalence_partition(new_first, last, p);
}

template<typename BidirIt>
void equivalence_partition(BidirIt first, BidirIt last) {
  using element_type = typename std::decay<decltype(*first)>::type;
  equivalence_partition(first, last, std::equal_to<element_type>());
}

Example here.
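If the original relative order should be preserved, std::stable_partition can be used instead. The following is a sketch under that assumption, not the answer's own code: an iterative variant (so no recursion depth proportional to the number of clusters) that keeps the pivot outside the partitioned range, so the predicate never reads an element the algorithm is moving. The name `stable_equivalence_partition` is made up for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

// Groups equivalent elements together. Clusters appear in order of first
// occurrence, and elements inside a cluster keep their original order.
template <class BidirIt, class BinaryPredicate>
void stable_equivalence_partition(BidirIt first, BidirIt last, BinaryPredicate p)
{
    while (first != last) {
        auto const pivot = first;  // representative of the current cluster
        // Partition only the elements after the pivot; matches move up front.
        first = std::stable_partition(std::next(pivot), last,
                                      [&](auto const& elem) { return p(*pivot, elem); });
    }
}
```

It is called the same way as the version above, for example with a predicate that compares only the `first` member of each pair. Note that std::stable_partition may allocate a temporary buffer, so this variant is not strictly in place.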

Chris Hayden
  • Sort on the second element is not a requirement of the OP's post. – Chris Hayden Aug 15 '14 at 18:18
  • +1 because I didn't know about `std::partition`. In this case however I feel like the double loop approach is going to be more straightforward while doing exactly the same thing as `std::partition` – pqnet Aug 16 '14 at 06:20
  • @pqnet raw loops are an anti-pattern in modern C++; they make code hard to analyze. Try making the partitioning stable: here it just requires the use of `stable_partition` – TemplateRex Aug 16 '14 at 07:47
  • @TemplateRex as a side question, could you find me some reference about the relation between raw loops and code analysis? – pqnet Aug 16 '14 at 07:54
  • @pqnet see Sean Parent's [slides](https://github.com/sean-parent/sean-parent.github.com/wiki/presentations/2013-09-11-cpp-seasoning/cpp-seasoning.pdf) and [video](http://channel9.msdn.com/Events/GoingNative/2013/Cpp-Seasoning) – TemplateRex Aug 16 '14 at 07:59