
We need to keep each mobile number and its location in memory. The challenge is that we have more than 5 million users, and storing the location for each user individually would mean a hash map of 5 million records. To avoid this, we have to work with ranges.

We are given ranges of phone numbers like

  • range1 start="9899123446" end="9912345678" location="a"

  • range2 start="9912345679" end="9999999999" location="b"

A number can belong to a single location only.

We need a data structure to store these ranges in memory.

It has to support two functions:

  1. findLocation(Integer number): returns the name of the location to which the number belongs
  2. changeLocation(Integer Number, String range): changes the location of Number from its old location to the new one

This is a completely in-memory design.

I am planning to use a tree structure in which each node contains (startofrange, endofrange, location), and I will keep the nodes in sorted order. I have not finalized anything yet. The main problem arises when the second function is called to change a location, say to change the location of 9899123448 to b.

The range1 node should then split into 3 nodes: 1st node (9899123446,9899123447,a), 2nd node (9899123448,9899123448,b), 3rd node (9899123449,9912345678,a).
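The split described above can be sketched in isolation as follows. This is only an illustration: the names `RangeSplit`, `Range` and `split` are hypothetical, and `long` is used because a 10-digit number such as 9899123446 does not fit in an `Integer`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that splits one range around a single number.
class RangeSplit {
    static final class Range {
        final long begin;
        final long end;
        final String location;

        Range(long begin, long end, String location) {
            this.begin = begin;
            this.end = end;
            this.location = location;
        }

        @Override
        public String toString() {
            return "(" + begin + "," + end + "," + location + ")";
        }
    }

    // Assumes r.begin <= number <= r.end. Returns up to three ranges:
    // the part before number, number itself, and the part after number.
    static List<Range> split(Range r, long number, String newLocation) {
        List<Range> parts = new ArrayList<>();
        if (r.begin < number) {
            parts.add(new Range(r.begin, number - 1, r.location));
        }
        parts.add(new Range(number, number, newLocation));
        if (number < r.end) {
            parts.add(new Range(number + 1, r.end, r.location));
        }
        return parts;
    }

    public static void main(String[] args) {
        Range range1 = new Range(9899123446L, 9912345678L, "a");
        // Prints the three nodes listed in the example above.
        System.out.println(split(range1, 9899123448L, "b"));
    }
}
```

When the number is the first or last number of the range, the sketch produces only two ranges, because the empty piece is skipped.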

Please suggest a suitable approach. Thanks in advance.

user2537119

2 Answers


Normally you can use specialized data structures to store ranges and implement the queries, e.g. Interval Tree.

However, since phone number ranges do not overlap, you can just store the ranges in a standard tree-based data structure (Binary Search Tree, AVL Tree, Red-Black Tree and B Tree would all work), sorted only by [begin].

For findLocation(number), use the corresponding tree search algorithm to find the element with the largest [begin] value that is less than or equal to the number, then check its [end] value to verify that the number is in that range. If a match is found, return the location; otherwise the number is not in any range.
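As a minimal sketch of this lookup, assuming Java and its `TreeMap` (a red-black tree) keyed by [begin]: the class `RangeLookup`, its nested `Range` type and `addRange` are hypothetical names introduced for illustration, and `long` is used because 10-digit phone numbers do not fit in an `Integer`.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: non-overlapping ranges stored in a tree sorted by begin.
class RangeLookup {
    // Value stored under each begin key: the end of the range and its location.
    static final class Range {
        final long end;
        final String location;

        Range(long end, String location) {
            this.end = end;
            this.location = location;
        }
    }

    private final TreeMap<Long, Range> byBegin = new TreeMap<>();

    void addRange(long begin, long end, String location) {
        byBegin.put(begin, new Range(end, location));
    }

    // Returns the location of number, or null if no range contains it.
    String findLocation(long number) {
        // floorEntry returns the entry with the greatest key <= number.
        Map.Entry<Long, Range> entry = byBegin.floorEntry(number);
        if (entry == null || number > entry.getValue().end) {
            return null;
        }
        return entry.getValue().location;
    }

    public static void main(String[] args) {
        RangeLookup lookup = new RangeLookup();
        lookup.addRange(9899123446L, 9912345678L, "a");
        lookup.addRange(9912345679L, 9999999999L, "b");
        System.out.println(lookup.findLocation(9899123448L)); // a
        System.out.println(lookup.findLocation(9912345679L)); // b
        System.out.println(lookup.findLocation(1L));          // null
    }
}
```

Because the ranges do not overlap, only the single candidate returned by `floorEntry` has to be checked, so a lookup is one O(log n) tree search.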

For changeLocation() operation:

  1. Find the old node containing the number
  2. If an existing node is found in step 1, delete it and insert new nodes
  3. If no existing node is found, insert a new node and try to merge it with adjacent nodes.
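The steps above can be sketched as follows, again assuming a Java `TreeMap` keyed by [begin]. The names `RangeStore`, `Range`, `addRange`, `mergeAround` and `size` are hypothetical. Merging with adjacent nodes is applied in both cases here, which is a choice made for this sketch rather than something the steps require.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of changeLocation on a tree of non-overlapping ranges.
class RangeStore {
    static final class Range {
        final long end;
        final String location;

        Range(long end, String location) {
            this.end = end;
            this.location = location;
        }
    }

    private final TreeMap<Long, Range> byBegin = new TreeMap<>();

    void addRange(long begin, long end, String location) {
        byBegin.put(begin, new Range(end, location));
    }

    int size() {
        return byBegin.size();
    }

    String findLocation(long number) {
        Map.Entry<Long, Range> e = byBegin.floorEntry(number);
        return (e != null && number <= e.getValue().end) ? e.getValue().location : null;
    }

    void changeLocation(long number, String newLocation) {
        // Step 1: find the old node containing the number.
        Map.Entry<Long, Range> e = byBegin.floorEntry(number);
        if (e != null && number <= e.getValue().end) {
            long begin = e.getKey();
            Range old = e.getValue();
            if (old.location.equals(newLocation)) {
                return; // nothing to change
            }
            // Step 2: delete the old node and re-insert the parts that keep the old location.
            byBegin.remove(begin);
            if (begin < number) {
                byBegin.put(begin, new Range(number - 1, old.location));
            }
            if (number < old.end) {
                byBegin.put(number + 1, new Range(old.end, old.location));
            }
        }
        // Step 3: insert the single-number node and merge it with adjacent nodes.
        byBegin.put(number, new Range(number, newLocation));
        mergeAround(number);
    }

    // Merges the node starting at begin with touching neighbours that share its location.
    private void mergeAround(long begin) {
        Range cur = byBegin.get(begin);
        long start = begin;
        long end = cur.end;
        Map.Entry<Long, Range> prev = byBegin.lowerEntry(begin);
        if (prev != null && prev.getValue().end + 1 == begin
                && prev.getValue().location.equals(cur.location)) {
            byBegin.remove(begin);
            start = prev.getKey();
        }
        Map.Entry<Long, Range> next = byBegin.higherEntry(begin);
        if (next != null && next.getKey() == end + 1
                && next.getValue().location.equals(cur.location)) {
            byBegin.remove(next.getKey());
            end = next.getValue().end;
        }
        byBegin.put(start, new Range(end, cur.location));
    }
}
```

For example, starting from the single range [10,16,b], `changeLocation(13, "a")` leaves three nodes, and a later `changeLocation(13, "b")` merges them back into one. Each call performs a constant number of tree operations, so it stays O(log n) and the tree's own balancing is not disturbed.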

I am assuming you are using the same operation for simply adding new nodes.

More practically, you can store all the entries in a database and build an index on [begin].

Chen Pang
  • Please consider the fact that, while changing the location of a number, I have to make an insertion into a sorted array. That would be too costly – user2537119 Sep 22 '13 at 21:16
  • Sorry, I only mean you should sort the data by [begin]. You can use a binary search tree or some more advanced data structure like AVL tree, B tree, etc. – Chen Pang Sep 22 '13 at 21:27
  • I will use balanced binary search trees. When the location change operation is called, say I have nodeoriginal [10,16,b] and need to change the location of number 13 to a. I will break nodeoriginal into 3 nodes: node1 [10,12,b], node2 [13,13,a], node3 [14,16,b]. After that I will replace nodeoriginal with node2, with node1 as its left child and node3 as its right child. Is there a best way to do this so that the tree remains balanced? – user2537119 Sep 23 '13 at 05:52
  • You can simply delete the old node and insert the three new nodes. The algorithm for a balanced binary search tree always ensures that it stays balanced – Chen Pang Sep 23 '13 at 18:03
  • If phone numbers from 10 to 14 are for location b, why would the number 13 be a? Does this case really exist? – Chen Pang Sep 23 '13 at 18:08
  • not sure how your `findLocation` algorithm would work. just because a begin value is smaller than the number doesn't mean that's the range it's in. it could be an even smaller range than the one required. for example we could have ranges [1-3] and [3-5], our value is 4, now if we encounter the begin value of 1 then we check the end value, 4 is not in the range [1-3], so by your logic we would say the number is not in any range. but actually we hadn't got to the range [3-5] yet, which is the "correct" answer. – Adam Burley Mar 31 '22 at 11:21

First of all range = [begin;end;location]

Use two structures:

  • A sorted array to store the range begins
  • A hash table to access ends and locations by begin

Apply the following algorithm:

  1. Use binary search to find the "nearest less" value of begin
  2. Use the hash table to find the end and location for that begin
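A minimal sketch of these two steps, assuming Java: the class `ArrayRangeIndex` and its constructor arguments are hypothetical, and the sketch covers lookup only, since inserting a new begin into the sorted array requires shifting elements.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: sorted array of begins plus hash tables keyed by begin.
class ArrayRangeIndex {
    private final long[] begins; // sorted range begins
    private final Map<Long, Long> endByBegin = new HashMap<>();
    private final Map<Long, String> locationByBegin = new HashMap<>();

    // Each row of ranges is {begin, end}; locations[i] belongs to ranges[i].
    ArrayRangeIndex(long[][] ranges, String[] locations) {
        begins = new long[ranges.length];
        for (int i = 0; i < ranges.length; i++) {
            begins[i] = ranges[i][0];
            endByBegin.put(ranges[i][0], ranges[i][1]);
            locationByBegin.put(ranges[i][0], locations[i]);
        }
        Arrays.sort(begins); // add all begins first, then sort once
    }

    // Returns the location of number, or null if no range contains it.
    String findLocation(long number) {
        // Step 1: binary search for the nearest begin that is <= number.
        int idx = Arrays.binarySearch(begins, number);
        if (idx < 0) {
            // Not found: binarySearch returned -(insertionPoint) - 1,
            // and the nearest smaller begin sits at insertionPoint - 1.
            idx = -idx - 2;
        }
        if (idx < 0) {
            return null; // number is below every begin
        }
        // Step 2: look up end and location for that begin.
        long begin = begins[idx];
        return number <= endByBegin.get(begin) ? locationByBegin.get(begin) : null;
    }
}
```

With ranges [1,7,a] and [10,16,b], `findLocation(5)` returns "a", `findLocation(12)` returns "b", and `findLocation(8)` returns null because 8 falls in the gap between the two ranges.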
k06a
  • Thanks for the suggestion. I will face a problem with insertion into the array. Say I have range1 = [1,7,a] and range2 = [10,16,b]. The sorted array will store [1,10]. If I have to change the location of 5 from a to b, the ranges become range1 = [1,4,a], range2 = [5,5,b], range3 = [6,7,a], range4 = [10,16,b]. Now I have to add the extra begins to the array, and the sorted array finally becomes [1,5,6,10]. Insertion is very costly, and I cannot use a list either, since that would affect search... – user2537119 Sep 22 '13 at 21:13
  • @user2537119 just add the begins of all your 5000000 ranges to the array and then sort it. – k06a Sep 23 '13 at 03:54
  • @user2537119 and then, if you manually add some values to the sorted array, this will not be very slow... How often will you do it? Once per 10 secs? 20 secs? I think an insertion time of 0.01s is not slow for manual insertion... – k06a Sep 23 '13 at 04:04
  • I cannot add the whole range. – user2537119 Sep 23 '13 at 05:18
  • To insert into a sorted array, I have to shift all the elements after the inserted element – user2537119 Sep 23 '13 at 05:19
  • @user2537119 add all the `begin`s, not the whole `range`s, and sort them once. – k06a Sep 23 '13 at 14:33