34

I have need of a sort of specialized dictionary. My use case is this: The user wants to specify ranges of values (the range could be a single point as well) and assign a value to a particular range. We then want to perform a lookup using a single value as a key. If this single value occurs within one of the ranges then we will return the value associated to the range.

For example:

// represents the keyed value
struct Interval
{
    public int Min;
    public int Max;
}

// some code elsewhere in the program
var dictionary = new Dictionary<Interval, double>();
dictionary.Add(new Interval { Min = 0, Max = 10 }, 9.0);
var result = dictionary[1];
if (result == 9.0) JumpForJoy();

This is obviously just some code to illustrate what I'm looking for. Does anyone know of an algorithm to implement such a thing? If so could they point me towards it, please?

I have already tried implementing a custom IEqualityComparer object and overloading Equals() and GetHashCode() on Interval but to no avail so far. It may be that I'm doing something wrong though.

Jeffrey Cameron
  • You'd have to implement your own custom collection. I don't think you can do what you are asking for with the standard Dictionary class. – Nick Jan 27 '10 at 14:22
  • Since your interval bounds are integers, if your domain is sufficiently small and no two intervals overlap, you could just use an array of doubles. In your example, array elements at index 0 to 10 would be set to 9.0. Lookup is then O(1). – Michael Petito Jan 27 '10 at 16:23
  • I would say overriding the `Equals` properly would give you the correct result, but that means you can not have two keys that overlap together in the dictionary – nawfal Apr 06 '13 at 08:09
  • maybe SortedSet? – Menahem Jul 10 '20 at 14:56

9 Answers

33

A dictionary is not the appropriate data structure for the operations you are describing.

If the intervals are required to never overlap then you can just build a sorted list of intervals and binary search it.
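For the non-overlapping case, a minimal sketch of the sorted-list-plus-binary-search idea might look like the following. The class name `IntervalLookup` and its members are hypothetical (they are not from the question), and the sketch assumes inclusive integer bounds as in the question's `Interval` struct.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a lookup over non-overlapping, inclusive integer intervals.
public sealed class IntervalLookup
{
    private readonly List<(int Min, int Max, double Value)> _entries;

    public IntervalLookup(IEnumerable<(int Min, int Max, double Value)> entries)
    {
        _entries = new List<(int Min, int Max, double Value)>(entries);

        // Sort by lower bound so that the binary search below is valid.
        _entries.Sort((a, b) => a.Min.CompareTo(b.Min));

        // Reject malformed or overlapping input, since the search relies on it.
        for (int i = 0; i < _entries.Count; i++)
        {
            if (_entries[i].Min > _entries[i].Max)
                throw new ArgumentException("Min must not exceed Max.");
            if (i > 0 && _entries[i].Min <= _entries[i - 1].Max)
                throw new ArgumentException("Intervals must not overlap.");
        }
    }

    // Binary search: O(log n) comparisons per lookup.
    public bool TryGetValue(int key, out double value)
    {
        int lo = 0, hi = _entries.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            var entry = _entries[mid];
            if (key < entry.Min) hi = mid - 1;        // key lies left of this interval
            else if (key > entry.Max) lo = mid + 1;   // key lies right of this interval
            else { value = entry.Value; return true; }
        }
        value = default;
        return false;
    }
}
```

With the question's data, `new IntervalLookup(new[] { (0, 10, 9.0) })` followed by `TryGetValue(1, out var result)` would yield 9.0. Returning `false` for keys that fall in a gap avoids having to pick a sentinel value.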

If the intervals can overlap then you have a more difficult problem to solve. To solve that problem efficiently you'll want to build an interval tree:

http://en.wikipedia.org/wiki/Interval_tree

This is a well-known data structure. See "Introduction To Algorithms" or any other decent undergraduate text on data structures.

Timo
Eric Lippert
  • The intervals are not allowed to overlap in my simulation so I'll stick to the SortedList. thanks for the advice Eric! – Jeffrey Cameron Jan 28 '10 at 12:50
  • While the comment from Jeffrey Cameron is quite old, it is worth noting that SortedList does not have a fast find-nearest-key operation at this time. For now, the List class and the List.BinarySearch method are what can provide fast nearest-key lookups. – Cameron Oct 25 '15 at 21:15
6

This is only going to work when the intervals don't overlap. And your main problem seems to be converting from a single (key) value to an interval.

I would write a wrapper around SortedList. The SortedList.Keys.IndexOf() would find you an index that can be used to verify if the interval is valid and then use it.
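One way such a wrapper could be sketched is with a comparer that treats intersecting intervals as equal, so a point lookup becomes a search for the interval `[key, key]`. The names `OverlapComparer` and `IntervalMap` are made up for illustration, the `Interval` struct here is a simplified stand-in for the one in the question, and the approach only holds up while the stored intervals never overlap.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the question's Interval (inclusive bounds).
public readonly struct Interval
{
    public int Min { get; }
    public int Max { get; }
    public Interval(int min, int max) { Min = min; Max = max; }
}

// Hypothetical comparer: two intervals compare as equal when they intersect.
// This is only a consistent ordering while stored intervals are disjoint.
public sealed class OverlapComparer : IComparer<Interval>
{
    public int Compare(Interval a, Interval b)
    {
        if (a.Max < b.Min) return -1;  // a lies entirely before b
        if (a.Min > b.Max) return 1;   // a lies entirely after b
        return 0;                      // the intervals intersect
    }
}

// Hypothetical wrapper around SortedList.
public sealed class IntervalMap
{
    private readonly SortedList<Interval, double> _items =
        new SortedList<Interval, double>(new OverlapComparer());

    // SortedList.Add throws ArgumentException for a key the comparer considers
    // equal to an existing one, which here means an overlapping interval.
    public void Add(int min, int max, double value)
    {
        if (min > max) throw new ArgumentException("min must not exceed max.");
        _items.Add(new Interval(min, max), value);
    }

    // Looks up the single-point interval [key, key].
    public bool TryGetValue(int key, out double value) =>
        _items.TryGetValue(new Interval(key, key), out value);
}
```

A side effect of this design is that the overlap check comes for free on `Add`. The trade-off is that insertion into a `SortedList` can be O(n), so this suits maps that are built once and read often.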

H H
  • I just tried it by using a standard SortedList with a custom comparer (that checks to see if the intervals intersect or not). It's working well! – Jeffrey Cameron Jan 27 '10 at 15:47
  • Indeed, if the intervals are required to not overlap then the problem is trivial; you can just binary search a sorted list. If the intervals are allowed to overlap then you have a rather more difficult problem. – Eric Lippert Jan 27 '10 at 16:42
3

This isn't exactly what you want but I think it may be the closest you can expect.

You can of course do better than this (Was I drinking earlier?). But you have to admit it is nice and simple.

var map = new Dictionary<Func<double, bool>, double>()
{
    { d => d >= 0.0 && d <= 10.0, 9.0 }
};

var key = map.Keys.Single(test => test(1.0));
var value = map[key];
ChaosPandion
1

I have solved a similar problem by ensuring that the collection is contiguous where the intervals never overlap and never have gaps between them. Each interval is defined as a lower boundary and any value lies in that interval if it is equal to or greater than that boundary and less than the lower boundary of the next interval. Anything below the lowest boundary is a special case bin.

This simplifies the problem somewhat. We also then optimized key searches by implementing a binary chop. I can't share the code, unfortunately.
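Since the original code can't be shared, here is a rough sketch of how the lower-boundary scheme could be written. It is not the author's implementation: `BoundaryBins` and its members are invented names, and it assumes integer boundaries with one value per bin plus a separate value for the "below everything" bin.

```csharp
using System;

// Hypothetical sketch of contiguous bins defined only by their lower boundaries.
public sealed class BoundaryBins
{
    private readonly int[] _lowerBounds;  // strictly ascending
    private readonly double[] _values;    // _values[i] covers [_lowerBounds[i], _lowerBounds[i + 1])
    private readonly double _belowAll;    // special-case bin below the lowest boundary

    public BoundaryBins(int[] lowerBounds, double[] values, double belowAll)
    {
        if (lowerBounds.Length != values.Length)
            throw new ArgumentException("Each lower bound needs exactly one value.");
        for (int i = 1; i < lowerBounds.Length; i++)
        {
            if (lowerBounds[i] <= lowerBounds[i - 1])
                throw new ArgumentException("Lower bounds must be strictly ascending.");
        }
        _lowerBounds = (int[])lowerBounds.Clone();
        _values = (double[])values.Clone();
        _belowAll = belowAll;
    }

    public double Lookup(int key)
    {
        int index = Array.BinarySearch(_lowerBounds, key);

        // A negative result is the bitwise complement of the index of the first
        // boundary greater than key, so the containing bin is one position earlier.
        if (index < 0) index = ~index - 1;

        return index < 0 ? _belowAll : _values[index];
    }
}
```

Because the bins are contiguous, every key maps to some bin, so `Lookup` can return a value directly rather than a found/not-found flag.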

Jeff Yates
0

I would make a little Interval class, which would look something like this:

public class Interval
{
    public int Start {get; set;}
    public int End {get; set;}
    public int Step {get; set;}
    public double Value {get; set;}

    public void WriteToDictionary(Dictionary<int, double> dict)
    {
        for(int i = Start; i < End; i += Step)
        {
            dict.Add(i, Value);
        }
    }
}

So you can still do a normal lookup within your dictionary. Maybe you should also perform some checks before calling Add(), or implement some kind of rollback if any value is already within the dictionary.

Oliver
0

You can find a Java flavored C# implementation of an interval tree in the Open Geospatial Library. It needs some minor tweaks to solve your problem and it could also really use some C#-ification.

It's Open Source but I don't know under what license.

Jonas Elfström
0

I adapted the Dictionary-and-Func idea that "ChaosPandion" gave in his earlier post above. The code works, but when I try to refactor it
I run into a surprising problem (a bug, or a gap in my understanding):

Dictionary<Func<string, double, bool>, double> map = new Dictionary<Func<string, double, bool>, double>()
{
        { (a, b) => a == "2018" && b == 4, 815.72},
        { (a, b) => a == "2018" && b == 6, 715.72}
};

What it does is this: I query the map with a search such as "2018" (year) and 4 (month), and the result is the double value 815.72. When I check the map entries, the keys are unique:

[Image: map working, unique keys]

So that's the original behaviour, and everything is fine so far. Then I try to refactor it to this:

Dictionary<Func<string, double, bool>, double> map =
    new Dictionary<Func<string, double, bool>, double>();

WS22(map, values2018, "2018");

private void WS22(Dictionary<Func<string, double, bool>, double> map, double[] valuesByYear, string strYear)
{
    int iMonth = 1;

    // step by step this works:
    map.Add((a, b) => (a == strYear) && (b == 1), dValue);
    map.Add((a, b) => (a == strYear) && (b == 2), dValue);

    // do it more elegantly...
    foreach (double dValue in valuesByYear)
    {
        // this does not work: exception after second iteration of foreach run
        map.Add((a, b) => (a == strYear) && (b == iMonth), dValue);
        iMonth += 1;
    }
}

This works (using b == 1 and b == 2).

This does not work: the map throws an exception when adding the item on the second iteration.

So I think the problem is that the key is not unique when it is added to the dictionary. What I don't see is why b == 1 works and b == iMonth does not.

Thanks for any help that opens my eyes :)
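A likely explanation (my reading of the snippet, not something confirmed in this thread): two delegates compare as equal when they have the same target object and the same method. The lambdas with `b == 1` and `b == 2` are compiled to two different methods, so they are distinct keys. The lambda inside the `foreach` is a single method, and because `strYear` and `iMonth` are both declared outside the loop, every iteration binds it to the same closure object, so the second `Add` sees an equal key and throws. Even without the exception, every predicate would observe the final value of `iMonth`. A sketch of one possible fix, using the hypothetical names `ClosureKeys` and `Build`, declares the captured month inside the loop body:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rewrite of the loop from the post above.
public static class ClosureKeys
{
    public static Dictionary<Func<string, double, bool>, double> Build(
        double[] valuesByYear, string year)
    {
        var map = new Dictionary<Func<string, double, bool>, double>();

        for (int i = 0; i < valuesByYear.Length; i++)
        {
            // Declared inside the loop body, so each iteration captures a fresh
            // variable and the delegate is bound to a fresh closure object.
            int month = i + 1;

            // Same lambda method every time, but a different target object,
            // so the delegates are not equal and Add does not throw.
            map.Add((a, b) => a == year && b == month, valuesByYear[i]);
        }

        return map;
    }
}
```

Each predicate now also remembers its own month rather than sharing one counter. Lookups still have to test every key in turn, as in the earlier `map.Keys.Single(...)` example.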

0

Using Binary Search, I created an MSTest v2 test case that approaches the solution. It assumes that the index is the actual number you are looking for, which does not (might not?) suit the description given by the OP.

Note that the ranges do not overlap. And that the ranges are

  • (negative infinity, 0)
  • [0, 5]
  • (5, 15]
  • (15, 30]
  • (30, 100]
  • (100, 500]
  • (500, positive infinity)

The values passed as minimumValues are sorted, since they are constants in my domain. If these values can change, the minimumValues list should be sorted again.

Finally, there is a test that uses if statements to get to the same result (which is probably more flexible if you need something else than the index).

[TestClass]
public class RangeUnitTests
{
    [DataTestMethod]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, -1, 0)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 0, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 1, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 5, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 7, 2)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 15, 2)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 16, 3)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 30, 3)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 31, 4)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 100, 4)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 101, 5)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 500, 5)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 501, 6)]
    public void Use_BinarySearch_To_Determine_Range(int[] minimumValues, int inputValue, int expectedRange)
    {
        var list = minimumValues.ToList();
        var index = list.BinarySearch(inputValue);
        if (index < 0)
        {
            index = ~index;
        }

        Assert.AreEqual(expectedRange, index);
    }

    [DataTestMethod]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, -1, 0)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 0, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 1, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 5, 1)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 7, 2)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 15, 2)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 16, 3)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 30, 3)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 31, 4)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 100, 4)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 101, 5)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 500, 5)]
    [DataRow(new[] { -1, 5, 15, 30, 100, 500 }, 501, 6)]
    public void Use_Ifs_To_Determine_Range(int[] _, int inputValue, int expectedRange)
    {
        int actualRange = 6;
        if (inputValue < 0)
        {
            actualRange = 0;
        }
        else if (inputValue <= 5)
        {
            actualRange = 1;
        }
        else if (inputValue <= 15)
        {
            actualRange = 2;
        }
        else if (inputValue <= 30)
        {
            actualRange = 3;
        }
        else if (inputValue <= 100)
        {
            actualRange = 4;
        }
        else if (inputValue <= 500)
        {
            actualRange = 5;
        }

        Assert.AreEqual(expectedRange, actualRange);
    }
}

I did a little performance testing by duplicating the initial set of [DataRow] attributes several times (up to 260 test cases for each method). I did not see a significant difference in performance with these parameters. Note that I ran each [DataTestMethod] in a separate session. Hopefully this balances out any start-up costs that the test framework might add to the first test that is executed.

Timo
-2

You could check out PowerCollections, found here on CodePlex, which has a collection that can do what you are looking for.

Hope this helps, Best regards, Tom.

t0mm13b
  • What collection type would that be? – Jørn Schou-Rode Jan 27 '10 at 14:36
  • @Jorn Schou-Rode: MultiDictionary, 'MultiDictionary class that associates values with a key. Unlike an Dictionary, each key can have multiple values associated with it. When indexing an MultiDictionary, instead of a single value associated with a key, you retrieve an enumeration of values.' – t0mm13b Jan 27 '10 at 14:45
  • That makes no sense, as the OP wants to map an interval of values to one single value. So it's the key that needs to consist of multiple values, either the interval boundaries or all values in that interval. – Frank Bollack Jan 27 '10 at 15:30
  • @Frank: As per the documentation, 'MultiDictionary', and 'When constructed, you can chose to allow the same value to be associated with a key multiple times, or only one time.', and that one of the methods was 'AddMany' which implies 'Adds new values to be associated with a key. If duplicate values are permitted, this method always adds new key-value pairs to the dictionary. If duplicate values are not permitted, and key already has a value equal to one of values associated with it, then that value is replaced, and the number of values associate with key is unchanged.' – t0mm13b Jan 27 '10 at 15:40
  • A multi dictionary is not an appropriate data structure to represent an interval tree. – Eric Lippert Jan 27 '10 at 16:04