
The question

I have two ranges, call them v and w, that are sorted with respect to a given order relation (call it T) and can be compared. I want to compare them lexicographically, but after sorting them with respect to a different order relation (call it S). For this I do not really need the ranges to be completely sorted: I only need to lazily evaluate the elements of the sorted ranges until I find a difference. For example, if the maximum of v in this new order is larger than the maximum of w, then I only need to look at one element of each sorted range. In the worst case, when v == w, I would have to look at every element.

I understand that the C++20 views in std::ranges::views allow me to get a read-only view of v and w that is lazily evaluated. Is it possible to get a custom sorted view that is still lazily evaluated? If I were able to write something like the following pseudocode

auto v_view_sorted_S = v | std::views::lazily_sort();
auto w_view_sorted_S = w | std::views::lazily_sort();

Then I could simply call std::ranges::lexicographical_compare(v_view_sorted_S, w_view_sorted_S).

How does one implement this?

Would simply calling std::ranges::sort(std::views::all(v)) work, in the sense that it accepts a view instead of an actual container and, more importantly, evaluates the view lazily? I gather from the comments on the answer to this question that, under some conditions, std::ranges::sort can be applied to views, even transformed ones. But I suspect that it sorts them at call time. Is that the case?

My use case:

I am interested in any example, but the particular use case I have is the following. It is irrelevant to the question, but it helps put this in context.

The structures v and w are of the form

std::array<std::vector<unsigned int>,N> v;

Where N is a compile-time constant. Moreover, for each 0 <= i < N, v[i] is guaranteed to be non-increasing. The lexicographical order thus obtained for any two ordered arrays is what I called T above.

What I am interested in is comparing them by the following rule: given entries a = v[i][j] and b = v[k][l] with 0 <= i,k < N and j,l >= 0, declare a > b if that relation holds as unsigned integers, or if a == b as unsigned integers and i < k.

After ordering all entries of v and w with respect to this order, I want to compare them lexicographically.

For example, if v = {{2,1,1}, {}, {3,1}}, w = {{2,1,0}, {2}, {3,0}} and z = {{2,1,0}, {3}, {2,0}}, then z > w > v.

Nicol Bolas
Reimundo Heluani
  • How can you lazily evaluate a sort? You'd have to first find the first element (linear time), then find the 2-element initial sorted partition (and discard it, returning the second element), then the 3-element initial sorted partition, and so on. It's much more expensive than just doing the sort up front. It's going to approach _O(N log N)_ for _each_ lazily-evaluated element, so _O(N² log N)_, right? – Useless Aug 18 '21 at 17:53
  • @Useless yes, but the bet here is that one needs to do very few lookups for the remaining max. I mean I am interested in comparing two objects, not keeping the sorted versions. – Reimundo Heluani Aug 18 '21 at 17:56
  • It can be implemented in O(N log N) by returning the maximum element and keeping a view of the remaining ones. – Reimundo Heluani Aug 18 '21 at 18:36
  • Sure, but you're keeping a lot of state off to the side in your lazily evaluated range ... probably up to N-1 indices for all the holes in your original array. – Useless Aug 18 '21 at 18:43
  • @Useless indeed, but that would be fine by me: these structures are not too large, but they need to be compared quite often. I was wondering if there was already an STL-ready implementation of this, but if there isn't I think I'll just implement my own view with `std::ranges::view_interface` – Reimundo Heluani Aug 18 '21 at 19:14
  • I don't understand why the part "or if a == b as unsigned integers and i < k" is necessary. For your example, isn't sorted(v) always {3, 2, 1, 1, 1} no matter whether this condition is added or not? – xskxzr Aug 19 '21 at 02:43
  • @xskxzr sorted(v) keeps the information of which vector each `unsigned int` came from, so it would be `{(3,2),(2,0),(1,0),(1,0),(1,2)}` in the example of the question. This is useful to decide, for example, `z > w` above, because z starts with `{(3,1),...}` and a single application of finding the max differentiates them: you find (3,2) for w, (3,1) for z and you stop immediately. – Reimundo Heluani Aug 19 '21 at 08:44
  • So do you want to lexicographically compare the vector of such pairs (entry, position)? – xskxzr Aug 19 '21 at 14:43
  • @xskxzr yes I want to be able to start from `w` and `v` as in the last example and check that `w>v` without having to convert and sort `w` and `v` before. – Reimundo Heluani Aug 19 '21 at 15:46
  • An "efficient" lazy sort could be done with O(lg n) additional state. A relatively easy way would be to start with a quick sort. You partition, and then lazy the lower partition, and recurse. Eventually the lowest element is found. Then when someone asks for higher than your sort high water mark, you just run the lazy quicksort within that partition. This should take O(lg n) time per element produced; the extra state is the currently "live" partition indexes. – Yakk - Adam Nevraumont Aug 19 '21 at 18:00
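For a general range with no pre-existing order to exploit, the idea from the comments of "returning the maximum element and keeping a view of the remaining ones" can be sketched with a binary heap: std::make_heap costs O(n) up front and each std::pop_heap costs O(log n), so only the elements actually inspected pay the logarithmic price. Note that this swaps in a heap rather than the lazy quicksort described in the last comment, and that it works on a copy of the data. LazyDescending and lazy_sorted_less are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: hands out the elements of a copied buffer from
// largest to smallest (with respect to `less`), one at a time.
template <class T, class Less = std::less<T>>
class LazyDescending {
public:
    explicit LazyDescending(std::vector<T> items, Less less = {})
        : heap_(std::move(items)), less_(less), live_(heap_.size()) {
        std::make_heap(heap_.begin(), heap_.end(), less_);  // O(n) setup
    }
    bool empty() const { return live_ == 0; }
    // Removes and returns the largest remaining element in O(log n).
    // The reference stays valid only until the next call to pop().
    const T& pop() {
        assert(!empty());
        const auto last = heap_.begin() + static_cast<std::ptrdiff_t>(live_);
        std::pop_heap(heap_.begin(), last, less_);  // moves the max to last - 1
        return heap_[--live_];
    }
private:
    std::vector<T> heap_;
    Less less_;
    std::size_t live_;  // heap_[0, live_) is still a heap
};

// True if a, sorted in descending order, is lexicographically smaller than
// b sorted the same way. Stops at the first position where they differ.
template <class T, class Less = std::less<T>>
bool lazy_sorted_less(std::vector<T> a, std::vector<T> b, Less less = {}) {
    LazyDescending<T, Less> la(std::move(a), less), lb(std::move(b), less);
    while (!la.empty() && !lb.empty()) {
        const T& x = la.pop();
        const T& y = lb.pop();
        if (less(x, y)) return true;
        if (less(y, x)) return false;
    }
    return la.empty() && !lb.empty();  // a proper prefix counts as smaller
}
```

For the (entry, position) pairs discussed above, one would pass a comparator implementing the order S instead of the default std::less. Whether the O(n) setup and the copy beat a plain sort depends on how early the two sequences typically differ.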

0 Answers