1

I have been trying to understand the new ranges library and try to convert some of the more traditional for loops into functional code. The example code given by cppreference is very straight forward and readable. However, I am unsure how to apply Ranges over a vector of Points that needs to have every x and y values looked at, calculated, and compared at the end for which is the greatest distance.

struct Point
{
  double x;
  double y;
}

double ComputeDistance(const Point& p1, const Point& p2)
{
  return std::hypot(p1.x - p2.x, p1.y - p2.y);
}

double GetMaxDistance(const std::vector<Point>& points)
{
  double maxDistance = 0.0;

  for (int i = 0; i < points.size(); ++i)
  {
    for(int j = i; j < points.size(); ++j)
    {
      maxDistance = std::max(maxDistance, ComputeDistance(points.at(i),points.at(j)));
    }
  }
  return maxDistance;
}

GetMaxDistance is the code that I would love to try and clean up and apply ranges on it. Which I thought would be as simple as doing something like:

double GetMaxDistance(const std::vector<Point>& points)
{
  auto result = points | std::views::tranform(ComputeDistance);
  return static_cast<double>(result);
}

And then I realized that was not correct since I am not passing any values into the function. So I thought:

double GetMaxDistance(const std::vector<Point>& points)
{
  for(auto point : points | std::views::transform(ComputeDistance)) 
    // get the max distance somehow and return it?
    // Do I add another for(auto nextPoint : points) here and drop the first item?
}

But then I realized that I am applying that function to every point, but not the point next to it, and this would also not work since I am still only passing in one argument into the function ComputeDistance. And since I need to compute the distance of all points in the vector I have to compare each of the points to each other and do the calculation. Leaving it as an n^2 algorithm. Which I am not trying to beat n^2, I would just like to know if there is a way to make this traditional for loop take on a modern, functional approach.

Which brings us back to the title. How do I apply std::ranges in this case? Is it even possible to do with what the standard has given us at this point? I know more is to be added in C++23. So I don't know if this cannot be achieved until that releases or if this is not possible to do at all.

Thanks!

Sailanarmo
  • 1,139
  • 15
  • 41
  • 1
    Unrelated: You don't need `tempMax`: Just assign directly: `maxDistance = std::max(maxDistance, ComputeDistance(points.at(i),points.at(j)));` - so the following `if` can be removed. – Ted Lyngmo Oct 21 '21 at 16:42
  • Unrelated 2: The `for` loops do a lot of unnecessary comparisons. `for (unsigned i = 0; i < points.size() - 1; ++i)` and `for(unsigned j = i + 1; j < points.size(); ++j) ` would cover all. You also do not need to use the bounds-checking `at()` function. Just use `operator[]`. – Ted Lyngmo Oct 21 '21 at 16:44
  • 1
    Good catch! I'll update that line. – Sailanarmo Oct 21 '21 at 16:44
  • @TedLyngmo Honestly this has been my gut reaction as well. But I can't wrap my head around on what would be a better implementation. I thought about a divide an conquer approach, however, I realized that if I had two separate buckets that I would be missing comparing a max distance between at say i at 1 and i at 5 if they were broken up. Those two could theoretically be the max distance. I also thought about the sliding window algorithm as well. – Sailanarmo Oct 21 '21 at 16:48
  • I don't know if there's a cleaver way to order the points before doing this loop, but if you're going to compare all points, you could do it like I did above to save a few CPU cycles. The `at()` is basically doing `if(index < size())` for every access and shaving of `-1` in the outer and `+1` in the inner save a tiny amount too. – Ted Lyngmo Oct 21 '21 at 16:50
  • 2
    Still unrelated, but you can find the max of the *squares* of the distances, avoiding all those square roots. – Bob__ Oct 21 '21 at 16:51
  • With @Bob__'s idea, it could look [like this](https://godbolt.org/z/36n76n8xe) – Ted Lyngmo Oct 21 '21 at 17:05

1 Answers1

10

The algorithm you're looking for is combinations - but there's no range adaptor for that (neither in C++20 nor range-v3 nor will be in C++23).

However, we can manually construct it in this case using an algorithm usually called flat-map:

inline constexpr auto flat_map = [](auto f){
    return std::views::transform(f) | std::views::join;
};

which we can use as follows:

double GetMaxDistance(const std::vector<Point>& points)
{
    namespace rv = std::views;
    return std::ranges::max(
        rv::iota(0u, points.size())
        | flat_map([&](size_t i){
            return rv::iota(i+1, points.size())
                 | rv::transform([&](size_t j){
                     return ComputeDistance(points[i], points[j]);
                 });
          }));
}

The outer iota is our first loop. And then for each i, we get a sequence from i+1 onwards to get our j. And then for each (i,j) we calculate ComputeDistance.

Or if you want the transform at top level (arguably cleaner):

double GetMaxDistance(const std::vector<Point>& points)
{
    namespace rv = std::views;
    return std::ranges::max(
        rv::iota(0u, points.size())
        | flat_map([&](size_t i){
            return rv::iota(i+1, points.size())
                 | rv::transform([&](size_t j){
                     return std::pair(i, j);
                 });
          })
        | rv::transform([&](auto p){
            return ComputeDistance(points[p.first], points[p.second]);
          }));
}

or even (this version produces a range of pairs of references to Point, to allow a more direct transform):

double GetMaxDistance(const std::vector<Point>& points)
{
    namespace rv = std::views;
    namespace hof = boost::hof;

    return std::ranges::max(
        rv::iota(0u, points.size())
        | flat_map([&](size_t i){
            return rv::iota(i+1, points.size())
                 | rv::transform([&](size_t j){
                     return std::make_pair(
                         std::ref(points[i]),
                         std::ref(points[j]));
                 });
          })
        | rv::transform(hof::unpack(ComputeDistance)));
}

These all basically do the same thing, it's just a question of where and how the ComputeDistance function is called.


C++23 will add cartesian_product and chunk (range-v3 has them now) , and just recently added zip_transform, which also will allow:

double GetMaxDistance(const std::vector<Point>& points)
{
    namespace rv = std::views;
    namespace hof = boost::hof;

    return std::ranges::max(
        rv::zip_transform(
           rv::drop,
           rv::cartesian_product(points, points)
           | rv::chunk(points.size()),
           rv::iota(1))
        | rv::join
        | rv::transform(hof::unpack(ComputeDistance))
    );
}

cartesian_product by itself would give you all pairs - which both includes (x, x) for all x and both (x, y) and (y, x), neither of which you want. When we chunk it by points.size() (produces N ranges of length N), then we repeatedly drop a steadingly increasing (iota(1)) number of elements... so just one from the first chunk (the pair that contains the first element twice) and then two from the second chunk (the (points[1], points[0]) and (points[1], points[1]) elements), etc.

The zip_transform part still produces a range of chunks of pairs of Point, the join reduces it to a range of pairs of Point, which we then need to unpack into ComputeDistance.

This all exists in range-v3 (except zip_transform there is named zip_with). In range-v3 though, you get common_tuple, which Boost.HOF doesn't support, but you can make it work.

Barry
  • 286,269
  • 29
  • 621
  • 977
  • 2
    Please don't scare away beginners. – 康桓瑋 Oct 21 '21 at 17:46
  • @康桓瑋 Would you say that playing with `std::ranges` should be for beginners? I feel the ranges library and functional programming is probably for the more seasoned programmer. I know you meant this as a joke btw haha. I just wanted to ask the question. – Sailanarmo Oct 21 '21 at 18:05
  • @Barry, I was trying your first implementation with my gcc 11.1 compiler. And for some reason it seg faults. I'm not sure why, but it seems to be having issues incrementing `i`. I printed out `i` and `j` and got `(0,1) -> (0,5)`. However, on the next iteration I get `(140737243246664,2)`. Of course the big number changes each time but I am not sure why it knows how to increment `j` and not `i` properly. `std::views::iota` should increment `0u` by 1 each time. – Sailanarmo Oct 21 '21 at 18:40
  • @Sailanarmo [godbolt gcc 11.1](https://godbolt.org/z/63KevnWEf) - no problem there – Ted Lyngmo Oct 21 '21 at 19:00
  • @TedLyngmo that is so weird. No problem on Godbolt at all like you said. I copy/pasted that top code straight into my header file and it seg faults in my above comment. But that clearly isn't showing on Godbolt as it runs and outputs correctly. – Sailanarmo Oct 21 '21 at 19:10
  • @Sailanarmo That's odd. I even added `-fsanitize=address`. If you just copy the whole stuff and put in a new file and compile that. Any issues? – Ted Lyngmo Oct 21 '21 at 19:12
  • @TedLyngmo yup, I moved everything over to a single cpp file. Pretty much copy pasted the godbolt stuff over. The strange thing is, if I do `g++ -std=c++20 -fsanitize=address` I will get `AddressSanitizer:DEADLYSIGNAL: SEGV on unknownn address` however, if I pass in all the arguments provided in the godbolt link the program will run, but it will not output a single thing. – Sailanarmo Oct 21 '21 at 19:35
  • I lied, it outputs 5 correctly if I pass in all the arguments? What the heck?! – Sailanarmo Oct 21 '21 at 19:36
  • @TedLyngmo [godbolt gcc 11.1 with only -fsantize=address on](https://godbolt.org/z/EPG8E69f3) – Sailanarmo Oct 21 '21 at 19:38
  • @Sailanarmo Interesting. Without optimization the first pair becomes `140731754530400` `1` instead of `0` `1`. Scary. – Ted Lyngmo Oct 21 '21 at 19:57
  • @TedLyngmo is this a bug in the compiler? Is this something worth submitting to GCC? – Sailanarmo Oct 21 '21 at 20:15
  • @Sailanarmo It sure feels like a candidate for a bug report. I'm not sure if `clang++` has their own implementation of this in their standard library. It could be worth comparing how they deal with it. If `clang++` works with their own lib and/or gcc's etc. – Ted Lyngmo Oct 21 '21 at 20:21
  • 1
    @Sailanarmo This works on 11.2 and fails on 10.3. Probably https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100290. – T.C. Oct 22 '21 at 01:43
  • @T.C. ahhh, so it was already reported as a bug and fixed! Good. I was getting nervous on filing a bug report haha. – Sailanarmo Oct 22 '21 at 01:45
  • @Barry Just curious, why std::pair instead of std::tuple? – Glenn Teitelbaum Oct 22 '21 at 02:49
  • @GlennTeitelbaumG. [Eric Niebler's comment](https://stackoverflow.com/a/59024241/11638718) under the answer may be helpful. – 康桓瑋 Oct 22 '21 at 03:00