I'd like to store a bunch of range items in a std::set container.

This data structure should quickly decide whether a specific input range is contained in one of the ranges that the set currently holds. My idea is to overload the comparison used by std::set, so that set::find can check whether one of the items in the set contains the input range argument.

It should also support a range item that represents a single point (start_range == end_range).

Here's my implementation:

#include <iostream>
#include <map>
#include <set>
using std::set;
using std::map;

class range : public std::pair<int,int>
{
public:
    range(int lower, int upper)
    {
        if (upper < lower)
        {
           first = upper;
           second = lower;
        }
        else
        {
           first = lower;
           second = upper;
        }
    }
    range(int val)
    {
        first = second = val;
    }
    bool operator<(range const & b) const
    {
        if (second < b.first)
        {
            return true;
        }
        return false;
    }
};

And here's how I test my data structure:

int main(int argc, const char * argv[])
{
    std::map<int, std::set<range>> n;

    n[1].insert(range(-50,-40));
    n[1].insert(range(40,50));
    n[2].insert(range(-30,-20));
    n[2].insert(range(20,30));
    n[3].insert(range(-20,-10));
    n[3].insert(range(10,20));

    range v[] = {range(-50,-41), range(30,45), range(-45,-45), range(25,25)};
    int j[] = {1,2,3};
    for (int l : j)
    {
        for (range i : v)
        {
            if (n[l].find(i) != n[l].end())
            {
                std::cout << l << "," << i.first << ","  << i.second << " : " 
                          << n[l].find(range(i))->first  << " "
                          << n[l].find(range(i))->second << std::endl;
            }
        }
    }
}

And here are the results I get:

1,-50,-41 : -50 -40 --> good 
1,30,45 : 40 50     --> bad
1,-45,-45 : -50 -40 --> good
2,30,45 : 20 30     --> bad
2,25,25 : 20 30     --> good

So as you can see, my code handles single-point ranges perfectly well (-45 is contained in range (-50,-40) and 25 is contained in range (20,30)).

However, for wider ranges, my current operator< cannot detect the containment relationship: any two overlapping ranges count as equal in set terminology (meaning that for ranges a and b, !(a<b) && !(b<a)).

Is there any way to change this operator to make it work?

Irad K

3 Answers


Sounds like a perfect match for the Boost Interval Container Library. In short, you can do this:

#include <iostream>
#include <utility>
#include <boost/icl/interval_set.hpp>

// Helper function template to reduce explicit typing:
template <class T>
auto closed(T&& lower, T&& upper)
{
   return boost::icl::discrete_interval<T>::closed(std::forward<T>(lower),
        std::forward<T>(upper));
}

int main()
{
    boost::icl::interval_set<int> ranges;

    ranges.insert(closed(1, 2));
    ranges.insert(closed(42, 50));

    // contains() is found through argument-dependent lookup in boost::icl
    std::cout << contains(ranges, closed(43, 46)) << "\n"; // true
    std::cout << contains(ranges, closed(42, 54)) << "\n"; // false
}

This should be easy to plug into your std::map and should be usable without further adjustments.

lubgr
  • Hi, perhaps you can elaborate some more on the definition of the `closed` method? I'm trying to figure out its purpose for the `interval_set`. Is this the way to define a closed range between two objects of type T? – Irad K Mar 06 '19 at 07:14
  • Sorry for my late response. The `closed` helper function constructs a closed interval, which can be inserted into an `interval_set`. The only purpose of this helper is to avoid typing the lengthy `boost::icl::discrete_interval<int>::closed(42, 50)` when constructing intervals. So yes, this is the way to define closed ranges between two objects of type T. – lubgr Mar 07 '19 at 07:48

Your operator< defines only a partial order: (30,45) < (40,50) is false and, at the same time, (40,50) < (30,45) is false, so as far as std::set and std::map are concerned the two ranges are equal. That is why you got these results.
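
To make that equivalence concrete, here is a minimal sketch that copies the question's comparison onto a stand-in struct. The names `Range`, `equivalent` and `found` are invented for this illustration and are not part of the question's code:

```cpp
#include <set>

// Stand-in for the question's range class, with the same operator<.
struct Range {
    int first;
    int second;
    bool operator<(Range const& b) const { return second < b.first; }
};

// std::set treats a and b as the same key when neither is less than the other.
bool equivalent(Range const& a, Range const& b)
{
    return !(a < b) && !(b < a);
}

// True if find() reports a match for key, whether or not key is really contained.
bool found(std::set<Range> const& s, Range const& key)
{
    return s.find(key) != s.end();
}
```

With a set holding (-50,-40) and (40,50), `found` returns true for (30,45) because that range merely overlaps (40,50), which is the "bad" case from the question.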

There is an article about partial orders: https://en.wikipedia.org/wiki/Partially_ordered_set

You might want to use std::unordered_map or somehow define a total order for your ranges.

I suggest an operator< that compares the arithmetic means of the range bounds, i.e. (a, b) < (c, d) if and only if (a+b)/2 < (c+d)/2, to get a total order. Note that you might want to use float for the arithmetic mean, since integer division truncates.
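
As a rough sketch of this suggestion (the names `Range`, `ByMidpoint`, `RangeSet` and `sameSlot` are made up for the illustration), the comparator below orders ranges by midpoint. It compares the sums of the bounds, which gives the same order as comparing the means but needs no division. Two ranges with the same midpoint still count as one key, and an ordering like this does not by itself answer containment queries:

```cpp
#include <set>

struct Range {
    int first;
    int second;
};

// Orders ranges by midpoint: a.first + a.second < b.first + b.second.
struct ByMidpoint {
    bool operator()(Range const& a, Range const& b) const
    {
        // Sums are computed in long long to avoid int overflow.
        return static_cast<long long>(a.first) + a.second
             < static_cast<long long>(b.first) + b.second;
    }
};

using RangeSet = std::set<Range, ByMidpoint>;

// True if the set holds a key with the same midpoint as `key`.
bool sameSlot(RangeSet const& s, Range const& key)
{
    return s.find(key) != s.end();
}
```

With (20,30) and (40,50) stored, (30,45) is no longer matched, but (25,25) is matched only because it happens to share a midpoint with (20,30), while (21,22) is not matched even though (20,30) contains it.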

For testing I suggest the following code draft (I wrote it here from scratch and haven't tested it). A return value of -1 means that no range in the vector contains this one.

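// Needs #include <vector> and a matching declaration inside class range.
// Returns the index of the first range in rangesVec that contains *this,
// or -1 if there is none. Uses the first/second members inherited from std::pair.
int range::firstContainsMe(const std::vector<range>& rangesVec) const
{
    for (size_t i = 0; i < rangesVec.size(); i++) {
        if (first >= rangesVec[i].first && second <= rangesVec[i].second) {
            return static_cast<int>(i);
        }
    }
    return -1;
}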
Michael Lukin
  • Hi and thanks for your response, perhaps you can modify my code to make it work? Thanks – Irad K Mar 05 '19 at 08:42
  • @IradK : Added in answer – Michael Lukin Mar 05 '19 at 08:47
  • @michaellukin, but how does the arithmetic mean help me detect that one range is contained inside another? I need this containment relation as well as the ordering (which the mean provides). – Irad K Mar 05 '19 at 08:53
  • @Yunnosch The explanation requires a rather long post, so I added a link to Wikipedia. – Michael Lukin Mar 05 '19 at 08:54
  • @IradK std::map is not appropriate for your purpose. – Michael Lukin Mar 05 '19 at 09:01
  • But how does unordered_map fit my purpose? Do you see any container that can help me? – Irad K Mar 05 '19 at 09:05
  • @IradK When I wrote the answer, the purpose of your question had not been stated yet. There is no standard container for this problem. I added a naive implementation; it might fit within your time limits. If not, use a specialized data structure such as an interval tree, mentioned in a comment on your question. – Michael Lukin Mar 05 '19 at 09:26

Your comparison operator is unsuitable.

If you wish to use any container or algorithm based on ordering in C++, the ordering relation needs to be a Strict Weak Ordering Relation. The definition can be found on Wikipedia; in short, the following rules must be respected:

  • Irreflexivity: For all x in S, it is not the case that x < x.
  • Asymmetry: For all x, y in S, if x < y then it is not the case that y < x.
  • Transitivity: For all x, y, z in S, if x < y and y < z then x < z.
  • Transitivity of Incomparability: For all x, y, z in S, if x is incomparable with y (neither x < y nor y < x hold), and y is incomparable with z, then x is incomparable with z.

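Your comparison operator fails the last rule, and is therefore unsuitable: (10,20) is incomparable with (15,35), and (15,35) is incomparable with (30,40), yet (10,20) < (30,40). In general, a quick way of obtaining a good comparison operator is to do what tuples do: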

bool operator<(range const & b) const
{
    return std::tie(first, second) < std::tie(b.first, b.second);
}
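
A brute-force check over a handful of sample ranges shows the difference between the two operators with respect to transitivity of incomparability. This is only a sketch: `Range`, `overlapLess`, `lexLess`, `incomparable` and `incomparabilityIsTransitive` are hypothetical names, and `lexLess` spells out by hand the lexicographic comparison that tuples perform:

```cpp
#include <vector>

struct Range {
    int first;
    int second;
};

// The question's comparison: a is "less" only if it lies entirely left of b.
bool overlapLess(Range const& a, Range const& b)
{
    return a.second < b.first;
}

// Lexicographic comparison on (first, second), as tuples do.
bool lexLess(Range const& a, Range const& b)
{
    return a.first < b.first || (a.first == b.first && a.second < b.second);
}

template <class Less>
bool incomparable(Range const& a, Range const& b, Less less)
{
    return !less(a, b) && !less(b, a);
}

// Checks every triple (x, y, z) of the sample: if x~y and y~z, then x~z must hold.
template <class Less>
bool incomparabilityIsTransitive(std::vector<Range> const& sample, Less less)
{
    for (Range const& x : sample)
        for (Range const& y : sample)
            for (Range const& z : sample)
                if (incomparable(x, y, less) && incomparable(y, z, less)
                    && !incomparable(x, z, less))
                    return false;
    return true;
}
```

For the sample (10,20), (15,35), (30,40) the check fails with `overlapLess` and passes with `lexLess`. A passing check on a sample is of course no proof for all inputs; it is only a quick way to catch a broken comparator.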

You want a map, not a set.

In order to solve your problem, you want a map, not a set.

For disjoint intervals, a map from lower-bound to upper-bound is sufficient:

std::map<int, int> intervals;

The .lower_bound and .upper_bound operations allow finding the closest key in O(log N) time, and from there containment is quickly asserted.
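
A minimal sketch of that lookup, assuming closed, pairwise disjoint intervals and a query with lower <= upper (the alias `IntervalMap` and the function `containsRange` are names chosen for this example):

```cpp
#include <map>

// Disjoint closed intervals stored as lower bound -> upper bound.
using IntervalMap = std::map<int, int>;

// True if [lower, upper] lies inside one of the stored intervals.
// Assumes lower <= upper and that the stored intervals do not overlap.
bool containsRange(IntervalMap const& intervals, int lower, int upper)
{
    // First stored interval whose lower bound is strictly greater than `lower`.
    auto it = intervals.upper_bound(lower);
    if (it == intervals.begin())
        return false;  // every stored interval starts after `lower`
    --it;              // the only interval that can contain `lower`
    return upper <= it->second;
}
```

With (-50,-40) and (40,50) stored, this accepts (-50,-41) and (-45,-45) and rejects (30,45) and (25,25). The lookup is a single upper_bound call, so it runs in O(log N).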

For non-disjoint intervals, things get trickier I fear, and you'll want to start looking into specialized data-structures (Interval Trees for example).

Matthieu M.