First option: cluster() and sort_within()
The handwritten double loop by @MadScienceDreams can be written as a cluster() algorithm of O(N * K) complexity for N elements and K clusters. It repeatedly calls std::partition (shown here in C++14 style with generic lambdas; it is easily adapted to C++11, or even to C++98 by writing your own function objects):
    template<class FwdIt, class Equal = std::equal_to<>>
    void cluster(FwdIt first, FwdIt last, Equal eq = Equal{})
    {
        for (auto it = first; it != last; /* increment inside loop */)
            it = std::partition(it, last, [=](auto const& elem){
                return eq(elem, *it);
            });
    }
which you call on your input vector<std::pair> as
    cluster(begin(v), end(v), [](auto const& L, auto const& R){
        return L.first == R.first;
    });
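To make the call above concrete, here is a minimal self-contained sketch. The sample values and the helpers is_clustered() and clustered_sample() are hypothetical and exist only for illustration; cluster() is repeated so the block compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Same algorithm as in the answer: each pass moves every element equal to *it
// to the front of [it, last) and continues after that run.
template<class FwdIt, class Equal = std::equal_to<>>
void cluster(FwdIt first, FwdIt last, Equal eq = Equal{})
{
    for (auto it = first; it != last; /* increment inside loop */)
        it = std::partition(it, last, [=](auto const& elem){
            return eq(elem, *it);
        });
}

// Hypothetical helper: true if no key reappears once its run has ended.
template<class FwdIt, class Equal>
bool is_clustered(FwdIt first, FwdIt last, Equal eq)
{
    for (auto it = first; it != last; /* increment inside loop */) {
        auto next = std::find_if_not(it, last, [=](auto const& e){ return eq(e, *it); });
        if (std::any_of(next, last, [=](auto const& e){ return eq(e, *it); }))
            return false;
        it = next;
    }
    return true;
}

// Hypothetical sample data; the values are made up for illustration.
std::vector<std::pair<int, int>> clustered_sample()
{
    std::vector<std::pair<int, int>> v = { {1, 9}, {2, 8}, {1, 7}, {3, 6}, {2, 5}, {1, 4} };
    auto same_first = [](auto const& L, auto const& R){ return L.first == R.first; };
    cluster(v.begin(), v.end(), same_first);
    assert(is_clustered(v.begin(), v.end(), same_first));
    return v;
}
```

The three pairs with key 1 end up at the front because the first pass partitions on the first element. std::partition is not stable, so the order of the remaining clusters, and the order of elements inside each cluster, is unspecified.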
The next algorithm to write is sort_within, which takes two predicates: an equality and a comparison function object. It repeatedly calls std::find_if_not to find the end of the current range, followed by std::sort to sort within that range:
    template<class RndIt, class Equal = std::equal_to<>, class Compare = std::less<>>
    void sort_within(RndIt first, RndIt last, Equal eq = Equal{}, Compare cmp = Compare{})
    {
        for (auto it = first; it != last; /* increment inside loop */) {
            auto next = std::find_if_not(it, last, [=](auto const& elem){
                return eq(elem, *it);
            });
            std::sort(it, next, cmp);
            it = next;
        }
    }
On an already clustered input, you can call it as:
    sort_within(begin(v), end(v),
        [](auto const& L, auto const& R){ return L.first == R.first; },
        [](auto const& L, auto const& R){ return L.second < R.second; }
    );
Live Example that shows it for some real data using std::pair<int, int>.
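As a small offline illustration of sort_within(), the following sketch runs it on an input that is already clustered by hand. The data and the function sorted_within_sample() are hypothetical; sort_within() is repeated so the block compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

template<class RndIt, class Equal = std::equal_to<>, class Compare = std::less<>>
void sort_within(RndIt first, RndIt last, Equal eq = Equal{}, Compare cmp = Compare{})
{
    for (auto it = first; it != last; /* increment inside loop */) {
        // End of the run of elements equal to *it.
        auto next = std::find_if_not(it, last, [=](auto const& elem){
            return eq(elem, *it);
        });
        std::sort(it, next, cmp);   // sort only inside that run
        it = next;
    }
}

// Hypothetical, already clustered sample: runs with keys 2, 1 and 3.
std::vector<std::pair<int, int>> sorted_within_sample()
{
    std::vector<std::pair<int, int>> v = { {2, 8}, {2, 5}, {1, 9}, {1, 4}, {1, 7}, {3, 6} };
    sort_within(v.begin(), v.end(),
        [](auto const& L, auto const& R){ return L.first == R.first; },
        [](auto const& L, auto const& R){ return L.second < R.second; }
    );
    assert(v.size() == 6);
    return v;
}
```

The result is {2,5}, {2,8}, {1,4}, {1,7}, {1,9}, {3,6}: the runs keep their original order (2, then 1, then 3) and only the elements inside each run are reordered.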
Second option: user-defined comparison
Even if there is no operator< defined on A, you might define it yourself. Here, there are two broad options. First, if A is hashable, you can define
    bool operator<(A const& L, A const& R)
    {
        return std::hash<A>()(L) < std::hash<A>()(R);
    }
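and write std::sort(begin(v), end(v)) directly. Unless you cache the hash values in separate storage, this makes O(N log N) calls to std::hash. Note that two distinct values whose hashes collide compare as equivalent under this operator<, so they are not guaranteed to end up in separate contiguous groups.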
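If the repeated hashing matters, one possible way to cache the hash values is sketched below. The type A, its toy std::hash specialization, and the function sort_by_cached_hash() are all hypothetical stand-ins, not part of the original question; the sketch only assumes that A is copyable and hashable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for A: equality-comparable and hashable, no operator<.
struct A {
    int id;
    friend bool operator==(A const& L, A const& R) { return L.id == R.id; }
};

namespace std {
    template<> struct hash<A> {
        std::size_t operator()(A const& a) const noexcept {
            // Toy hash, for illustration only.
            return static_cast<std::size_t>(a.id) * 31u + 7u;
        }
    };
}

// Hash each element once, sort on the cached values, then copy the elements back.
void sort_by_cached_hash(std::vector<A>& v)
{
    std::vector<std::pair<std::size_t, A>> keyed;
    keyed.reserve(v.size());
    for (auto const& a : v)
        keyed.emplace_back(std::hash<A>()(a), a);   // one hash call per element

    std::sort(keyed.begin(), keyed.end(), [](auto const& L, auto const& R){
        return L.first < R.first;                   // compares cached hashes only
    });

    assert(keyed.size() == v.size());
    std::transform(keyed.begin(), keyed.end(), v.begin(),
                   [](auto const& p){ return p.second; });
}
```

This trades O(N) extra storage for N hash calls in total. It groups equal elements only as long as distinct values do not share a hash, which is the same limitation as the operator< above.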
Second, if A is not hashable, but does have data member getters x(), y() and z() that uniquely determine equality on A, you can do
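    bool operator<(A const& L, A const& R)
    {
        // std::tie binds to lvalues, so this assumes the getters return references;
        // for getters that return by value, use std::make_tuple instead.
        return std::tie(L.x(), L.y(), L.z()) < std::tie(R.x(), R.y(), R.z());
    }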
Again you can write std::sort(begin(v), end(v)) directly.
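A minimal sketch of this second variant follows. The class A here is a hypothetical stand-in with three int members whose getters return const references, and sorted_sample() with its data is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical A: the getters return const references so that std::tie can bind to them.
class A {
public:
    A(int x, int y, int z) : x_(x), y_(y), z_(z) {}
    int const& x() const { return x_; }
    int const& y() const { return y_; }
    int const& z() const { return z_; }
private:
    int x_, y_, z_;
};

// Lexicographic comparison on (x, y, z).
bool operator<(A const& L, A const& R)
{
    return std::tie(L.x(), L.y(), L.z()) < std::tie(R.x(), R.y(), R.z());
}

// Hypothetical sample data, sorted with the operator< above.
std::vector<A> sorted_sample()
{
    std::vector<A> v = { A{2, 1, 1}, A{1, 2, 3}, A{2, 1, 0}, A{1, 2, 3} };
    std::sort(v.begin(), v.end());
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}
```

After sorting, the two (1, 2, 3) elements are adjacent and precede (2, 1, 0) and (2, 1, 1). Because the ordering is a full lexicographic comparison on the members that determine equality, equal elements always end up next to each other.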