4

Suppose that we wish to keep track of a point of maximum overlap in a set of intervals—a point that has the largest number of intervals in the database overlapping it.

a. Show that there will always be a point of maximum overlap which is an endpoint of one of the segments.

b. Design a data structure that efficiently supports the operations INTERVAL-INSERT, INTERVAL-DELETE, and FIND-POM, which returns a point of maximum overlap. (Hint: Keep a red-black tree of all the endpoints. Associate a value of +1 with each left endpoint, and associate a value of -1 with each right endpoint. Augment each node of the tree with some extra information to maintain the point of maximum overlap.)

this problem is in the book introduction to algorithm. But I have no idea how to solve the second question. if a greater mind has an elegant solution, please share your idea with me! Thanks.

city
  • 2,036
  • 2
  • 32
  • 40
  • 3
    This seems a lot like homework. What have you tried? Where do you think you are stuck? – Rob Neuhaus Feb 08 '13 at 19:55
  • try to look for greedy algorithms – 0x90 Feb 08 '13 at 19:56
  • The question contains a hint. That's the idea you need. – n. m. could be an AI Feb 08 '13 at 21:03
  • @rrenaud, if breaking up all intervals into endpoints, and then use RBT to sort them, how can I know which two endpoints are in the same intervals? – city Feb 08 '13 at 22:45
  • @city do you need to care? play around and see what happens when different sets of intervals all have the same endpoints and you remove some intervals. – Rob Neuhaus Feb 10 '13 at 21:42
  • @rrenaud, I do not understand the meaning of " Associate a value of +1 with each left endpoint, and associate a value of -1 with each right endpoint.". Does it mean that I scan the subtrees of a node, when encounter a children which is start point, plus one, or subtract one? Or the number of left children subtracts the number of right children? thanks a lot! – city Feb 11 '13 at 15:15

1 Answers1

4

quote:http://ripcrixalis.blog.com/2011/02/08/clrs-chapter-14/

Keep a RB-tree of all the endpoints. We insert endpoints one by one as a sweep line scaning from left to right. With each left endpoint e, associate a value p[e] = +1 (increasing the overlap by 1). With each right endpoint e associate a value p[e] = −1 (decreasing the overlap by 1). When multiple endpoints have the same value, insert all the left endpoints with that value before inserting any of the right endpoints with that value.

Here is some intuition. Let e1, e2, . . . , en be the sorted sequence of endpoints corresponding to our intervals. Let s(i, j) denote the sum p[ei] + p[ei+1] + · · · + p[ej] for 1 ≤ i ≤ j ≤ n. We wish to find an i maximizing s(1, i ). Each node x stores three new attributes. We store v[x] = s(l[x], r [x]), the sum of the values of all nodes in x’s subtree. We also store m[x], the maximum value obtained by the expression s(l[x], i) for any i. We store o[x] as the value of i for which m[x] achieves its maximum. For the sentinel, we define v[nil[T]] = m[nil[T]] = 0.

We can compute these attributes in a bottom-up fashion so as to satisfy the requirements of Theorem 14.1:

v[x] = v[left[x]] + p[x] + v[right[x]] ,
m[x] = max{
m[left[x]] (max is in x’s left subtree),
v[left[x]] + p[x] (max is at x),
v[left[x]] + p[x] + m[right[x]] (max is in x’s right subtree). }

Once we understand how to compute m[x], it is straightforward to compute o[x] from the information in x and its two children.

FIND-POM: return the interval whose endpoint is represented by o[root[T]]. Because of how we have deÞned the new attributes, Theorem 14.1 says that each operation runs in O(lg n) time. In fact, FIND-POM takes only O(1) time.

yxc
  • 229
  • 2
  • 5