
So I stumbled across this problem someone sent in a Discord group chat (irrelevant), and it seemed really intriguing. We have a target interval (in this example, [20, 100]) and some secondary intervals (in this example, [20, 45], [30, 75], [40, 80], [60, 95] and [90, 100]). Each secondary interval has a cost, as shown in the picture (the number next to each line is its cost). Our goal is to cover the target interval with the CHEAPEST possible set of secondary intervals.

example

For example, the interval [20, 100] can be covered at minimum cost by choosing the intervals with costs 10 + 20 + 5 + 15 = 50. (Choosing the interval with cost 30 instead of the interval with cost 20 would also achieve the desired result but at a higher total cost.)

What would be the best approach to this?

The first thing I tried was developing a greedy algorithm in C++, as shown for other similar problems online, but I had trouble with the "cheapest" part. How can I make sure I'm using the cheapest interval (or subset of intervals) that can replace a more expensive one?

EDIT: the problem actually only requires printing the total cost of the intervals needed to cover the target interval, not the intervals themselves.

dump34

2 Answers


Suppose the target interval is from s to e, and g(x) represents the minimum cost of covering from s to x (i.e. the answer you want is g(e)).

You could consider the sub-problem f(y) of minimum cost to cover the target interval from s to y with the additional constraint that nothing is covered beyond y (i.e. there must be a secondary interval that ends exactly at y).

If you have f(y), you can work out g(e) by computing the minimum of f(y) for all y >= e.

To work out f(y) you consider all intervals (a,y) that end at y and work out the minimum (f(x)+cost of interval (a,y)) for all x in the range a<=x<y.
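Written out as a recurrence, the idea above might look like the following. This is only a sketch: the triple (a, y, c) for an interval from a to y with cost c, and the base case for intervals that start at or before s, are my own assumptions and are not spelled out in the answer. The inner minimum ranges over endpoints x for which f(x) is defined.

```latex
f(y) = \min_{(a,\,y,\,c)}
\begin{cases}
  c & \text{if } a \le s, \\
  c + \min\limits_{a \le x < y} f(x) & \text{otherwise,}
\end{cases}
\qquad
g(e) = \min_{y \ge e} f(y)
```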

As stated, this would be O(n^2), as there are n values of f(y) to compute and each takes O(n) time. It could be sped up by using a better data structure to answer the minimum queries.
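A minimal C++ sketch of the quadratic version, in case it helps. The `Interval` struct, the function name `minCoverCostQuadratic`, the `-1` return value for "no cover exists", and the assumptions that costs are non-negative and that intervals are closed (so touching at a single point counts as connected) are all mine, not part of the problem statement.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical representation of a secondary interval [start, end] with a cost.
struct Interval {
    long long start, end, cost;
};

// Returns the minimum total cost to cover [s, e], or -1 if no cover exists.
// Assumes non-negative costs and closed intervals.
long long minCoverCostQuadratic(std::vector<Interval> iv, long long s, long long e) {
    const long long INF = std::numeric_limits<long long>::max();
    // Process intervals in order of increasing right endpoint.
    std::sort(iv.begin(), iv.end(),
              [](const Interval& a, const Interval& b) { return a.end < b.end; });

    // f[i]: cheapest cover of [s, iv[i].end] whose last interval is iv[i].
    std::vector<long long> f(iv.size(), INF);
    long long best = INF;

    for (std::size_t i = 0; i < iv.size(); ++i) {
        if (iv[i].end < s) continue;            // lies entirely left of the target
        if (iv[i].start <= s) {
            f[i] = iv[i].cost;                  // covers the start of the target on its own
        } else {
            long long prev = INF;
            for (std::size_t j = 0; j < i; ++j) // any earlier cover that reaches iv[i].start
                if (iv[j].end >= iv[i].start) prev = std::min(prev, f[j]);
            if (prev != INF) f[i] = prev + iv[i].cost;
        }
        if (iv[i].end >= e) best = std::min(best, f[i]);
    }
    return best == INF ? -1 : best;
}
```

For the example in the question, assuming the costs are assigned as [20, 45] = 10, [30, 75] = 20, [40, 80] = 30, [60, 95] = 5 and [90, 100] = 15 (the picture is not available, so this assignment is a guess that matches the stated total), calling `minCoverCostQuadratic(intervals, 20, 100)` gives 50.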

Peter de Rivaz

Sort the secondary intervals by endpoint, and discard the ones that don't overlap the target interval.

Now, you will build an array for all the secondary endpoints. For each endpoint, you will have (x,cost), where x is the endpoint and cost is the minimum cost to cover the target up to that endpoint.

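For each interval (start, end), you can easily calculate its entry (end, cost) by adding the interval's cost to the minimum cost of covering the target up to at least its start point (that cost is zero if the interval starts at or before the start of the target). You can find that cost using binary search on the array that has been built so far. Note that the new entry may make some entries at the end of the array obsolete (those that cost at least as much but reach no further), and you can remove them.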

Each binary search takes O(log n) time, for a total running time of O(n log n), the same as the initial interval sort.
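Here is one way the steps above could be sketched in C++. The names `Interval`, `minCoverCost` and `firstReaching` are hypothetical, and I am assuming non-negative costs, closed intervals, and `-1` as the "no cover" result. The array is kept strictly increasing in both endpoint and cost, so the first entry that reaches a given point is also the cheapest one that does.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical representation of a secondary interval [start, end] with a cost.
struct Interval {
    long long start, end, cost;
};

// Returns the minimum total cost to cover [s, e], or -1 if no cover exists.
long long minCoverCost(std::vector<Interval> iv, long long s, long long e) {
    std::sort(iv.begin(), iv.end(),
              [](const Interval& a, const Interval& b) { return a.end < b.end; });

    // best holds (x, cost): cheapest cost to cover [s, x].
    // Invariant: x and cost are both strictly increasing.
    std::vector<std::pair<long long, long long>> best;

    // First entry whose endpoint is >= x (binary search on the endpoints).
    auto firstReaching = [&](long long x) {
        return std::lower_bound(
            best.begin(), best.end(), x,
            [](const std::pair<long long, long long>& entry, long long value) {
                return entry.first < value;
            });
    };

    for (const Interval& in : iv) {
        if (in.end < s) continue;               // does not touch the target
        long long total = in.cost;
        if (in.start > s) {
            auto it = firstReaching(in.start);  // cheapest cover reaching in.start
            if (it == best.end()) continue;     // nothing reaches this interval yet
            total += it->second;
        }
        // Drop entries that cost at least as much but reach no further.
        while (!best.empty() && best.back().second >= total) best.pop_back();
        // Skip the push if a cheaper entry already ends at the same point.
        if (best.empty() || best.back().first < in.end)
            best.emplace_back(in.end, total);
    }

    auto it = firstReaching(e);
    return it == best.end() ? -1 : it->second;
}
```

With the same guessed cost assignment as in the question's example ([20, 45] = 10, [30, 75] = 20, [40, 80] = 30, [60, 95] = 5, [90, 100] = 15), `minCoverCost(intervals, 20, 100)` returns 50. The entry for [40, 80] (cost 40) is the one that gets removed, when [60, 95] produces a cheaper entry (cost 35) that reaches further.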

If you remember the interval that produced each cost in the array, then you can easily calculate the set of covering intervals in O(n) time by working backwards.
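If the intervals themselves are wanted, a sketch along these lines might work. It repeats the sorted-array idea and additionally stores, for each interval, the interval that preceded it in its cheapest chain. `Interval`, `Entry`, `cheapestCover` and the `parent` array are hypothetical names of mine; an empty result stands for "no cover exists", and costs are assumed to be non-negative.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical representation of a secondary interval [start, end] with a cost.
struct Interval {
    long long start, end, cost;
};

struct Entry {
    long long x;     // right endpoint reached
    long long cost;  // cheapest cost to cover [s, x]
    int last;        // index (after sorting) of the interval that ends at x
};

// Returns one cheapest set of covering intervals, ordered left to right,
// or an empty vector if [s, e] cannot be covered.
std::vector<Interval> cheapestCover(std::vector<Interval> iv, long long s, long long e) {
    std::sort(iv.begin(), iv.end(),
              [](const Interval& a, const Interval& b) { return a.end < b.end; });

    std::vector<int> parent(iv.size(), -1);  // previous interval in the chain, -1 if none
    std::vector<Entry> best;                 // x and cost both strictly increasing

    auto firstReaching = [&](long long x) {
        return std::lower_bound(best.begin(), best.end(), x,
                                [](const Entry& en, long long v) { return en.x < v; });
    };

    for (int i = 0; i < static_cast<int>(iv.size()); ++i) {
        if (iv[i].end < s) continue;
        long long total = iv[i].cost;
        if (iv[i].start > s) {
            auto it = firstReaching(iv[i].start);
            if (it == best.end()) continue;
            total += it->cost;
            parent[i] = it->last;            // remember which interval we extended
        }
        while (!best.empty() && best.back().cost >= total) best.pop_back();
        if (best.empty() || best.back().x < iv[i].end)
            best.push_back({iv[i].end, total, i});
    }

    std::vector<Interval> chosen;
    auto it = firstReaching(e);
    if (it == best.end()) return chosen;
    // Walk backwards through the remembered predecessors: O(n).
    for (int i = it->last; i != -1; i = parent[i]) chosen.push_back(iv[i]);
    std::reverse(chosen.begin(), chosen.end());
    return chosen;
}
```

On the question's example, with the guessed cost assignment [20, 45] = 10, [30, 75] = 20, [40, 80] = 30, [60, 95] = 5, [90, 100] = 15, this returns [20, 45], [30, 75], [60, 95], [90, 100].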

Matt Timmermans
  • I made a small edit to my post: the problem only requires outputting the minimum cost of the intervals, not the intervals themselves. Also, I'm a little confused, maybe because English is not my mother tongue. How would I get the "minimum cost to cover the target up to that endpoint" in the (x, cost) array? Or is that process described in the next paragraph? Also, in "You can find that cost", which cost does "that cost" refer to? Lastly, how do I calculate the entry (for building the array) by adding 2 costs, if 1 of those costs is found using binary search on the array that has been built so far? – dump34 Mar 14 '23 at 17:50