5

I have to find the most frequent number (or mode) from a sorted array of ranges. This range consists simply of a start and end number that is inclusive. As an example arr[0] could contain the numbers {0, 3}, which would then mean the numbers {0, 1, 2, 3}. The ranges are sorted by start number, and when start numbers are equal they are sorted by the end number. Both in ascending order.

The array is filled with n of these ranges. I need to find a way to find the number that occurs in the most ranges.

I have thought of a solution that goes through all the numbers once, but this is in O(n * m), where m is the average length of the ranges. This is an horrendously slow algorithm for the size I am dealing with.

I have also looked at some faster solutions such as this, but I can't think of an efficient way of implementing it when you cannot easily find the kth number, because the ranges can be wildly different sizes.

I'm currently trying to implement this in C, but any idea's or links to resources dealing with this are greatly appreciated.

double-beep
  • 5,031
  • 17
  • 33
  • 41
  • Your question title mentions an array of _sorted_ ranges. Are they sorted by start number or something else? – Ian Abbott Apr 20 '21 at 15:40
  • They are sorted by start number, and if start number are the same, then by end numbers. I'll add that info to my post. – user14480191 Apr 20 '21 at 15:50
  • How big can the range be? And how many ranges can be there at max? – AKSingh Apr 20 '21 at 16:09
  • 10^8 ranges. And each range can be between 0 and 2^32 - 1 – user14480191 Apr 20 '21 at 16:29
  • What if several numbers (not necessarily contiguous) are the most frequent? For example, if none of the ranges overlap then all numbers in the set of ranges occur 1 time only. – Ian Abbott Apr 20 '21 at 16:36
  • This is more of a proof of concept, so it would not really matter. But you could say the lowest (or highest) most frequent number. Or some other arbitrary way of choosing. There could also be an assumption that there won't be multiple most frequent numbers. – user14480191 Apr 20 '21 at 17:07

1 Answers1

2
  1. Convert each ranges {a, b} to two events "add 1 at a" and "subtract 1 at b+1".
  2. Sort the events by ascending order of where to add or subtract. subtract events should be come earlier than add events for same position.
  3. Execute the events and check at what point the result of calculation became highest.

For example, if we have 3 ranges {0, 5}, {2, 9}, {4, 7}, each range will be like this:

0123456789
------
  --------
    ----

And sorted events are:

  • add 1 to 0
  • add 1 to 2
  • add 1 to 4
  • subtract 1 to 6
  • subtract 1 to 8
  • subtract 1 to 10

Executing these events, the value becomes the highest 3 after executing the 3rd event and before executing the 4th event. Therefore we can say that the most appering numbers are 4 and 5.

The time complexity is O(N log N) because:

  • O(N) conversion from ranges to events
  • O(N log N) sorting of events
  • O(N) execution of events

The space complexity is O(N) for storing the events.

MikeCAT
  • 73,922
  • 11
  • 45
  • 70
  • Thank you for your response. I don't grasp how this approach would work. Would this require me to store the full range of numbers somewhere as shown by 0,1,2,3,4,5,7,8,9,10, and then update it to 1,1,3,3,5,5,5,7,7,9,9? An example would be greatly appreciated. I am also asking this because our max possible range is [0, INT_MAX], and storing all possible values could take a lot of memory. – user14480191 Apr 20 '21 at 15:15
  • 1
    The question title implies that space complexity needs to be O(1). I like your algorithm though! – Ian Abbott Apr 20 '21 at 15:37