
I am trying to implement redraw regions with up to 3 regions but can't think of an efficient way to find the best set of regions given a set of rectangles.

So there would be a set of rectangles, and I would need to calculate up to 3 bounding rectangles that together cover the smallest total area.

[Image: bounding rects]

The black rectangles are the set of rectangles whereas the red rectangles are the bounding boxes (up to 3) that produce the smallest possible area. Need to work out the best possible combination of bounding boxes.

Louis
  • 1
    So you're looking for a minimal area cover for an arbitrary set of rectangles whose elements are also rectangles and there are at most three elements in the cover? Maybe have a look at clustering algorithms. – mu is too short Feb 20 '11 at 05:05
  • 1
    Can they overlap? Also, your example does not give the smallest total area *(grouping both the two on the left or the two on the right would give a smaller area)* – BlueRaja - Danny Pflughoeft Feb 20 '11 at 05:08
  • No overlap. Ha yeah, was a quick diagram so not correct. – Louis Feb 20 '11 at 05:26
  • Louis, why is this tagged with the "javascript" tag? – Prestaul Feb 20 '11 at 06:43
  • 1
    For the canvas tag, to implement redraw regions as it is quite slow. But I guess it's not javascript specific. – Louis Feb 20 '11 at 06:58
  • Overlapping bounding "red" rectangles would produce a larger area. He's trying to avoid that. – dave Feb 20 '11 at 07:22
  • It would be helpful to label the "black" rectangles in the example diagram as well... – dave Feb 20 '11 at 09:09
  • 1
    @RPRPORO: Not necessary. Consider the case where the squares were the endpoints on a cross, and we were only allowed two red rectangles. Then having two overlapping rectangles in the shape of a cross will give the lowest overall area, even if you count that area twice. – BlueRaja - Danny Pflughoeft Feb 20 '11 at 22:51
  • @BlueRaja Hi. You are right, I meant what I wrote for this particular case (one to three bounding rectangles). – dave Feb 21 '11 at 03:55
  • 1
    @RPRPORO: No, it's still incorrect; imagine that same scenario as before, but with another square extremely far away from the other four - you would still make an overlapping cross with the first four, and the third bounding rectangle would be around the lone square. – BlueRaja - Danny Pflughoeft Feb 21 '11 at 06:29
  • @BlueRaja: The OP in the comments specifies no overlap. That rules out your example of 2 boxes in the shape of a cross. – btilly Feb 21 '11 at 08:23
  • I assumed overlap would be going backwards but I am starting to think it wouldn't be a problem. The regions only identify objects underlying and then redraw them so I could always weed out duplicates. – Louis Feb 21 '11 at 08:27
  • @Louis Wouldn't weeding out duplicates introduce another cost? – dave Feb 21 '11 at 17:54
  • @BlueRaja The case you originally stated ("black" rectangles as endpoints of "a cross") can also be solved with two "red" bounding rectangles without overlap resulting in the same area as the solution that involves overlapping. The new case you suggest ("extremely far" fifth square) is optimally solved for total area by use of overlapping as you suggest. – dave Feb 21 '11 at 18:13
  • If you're going to allow overlaps, the best algorithm that I can easily think of is O(n^7), but might be improvable to O(n^6). How important is it to always get the best solution? I already gave an O(n^3) algorithm that will give pretty good (though not always best) answers with no overlap. – btilly Feb 21 '11 at 18:16
  • @btilly Overlaps are now allowed as per Louis' latest comments. Fundamentally, producing *consistent*, optimal solutions is paramount to algorithm design. My recommendation to @Louis is still to review literature on *existing* geometric clustering algorithms or to consider applying a genetic or simulated annealing algorithm to this problem. – dave Feb 21 '11 at 19:07

4 Answers

1

This is a rather straightforward example. The idea is to 'grow' your bounding boxes, much like a MST. I feel the problem is similar to an MST except we have up to 3 disjoint trees, which increases the complexity significantly.

The algorithm takes about (n choose 3)*(3*n) steps, or O(n^4).

  1. Number the rectangles.
  2. Pick any combination of 3 rectangles. For each combination:
    1. Set your three initial bounding boxes to the three chosen rectangles themselves (same position, width, and height).
    2. For each remaining rectangle:
      1. Find the area that it would increase the bounding box by if it was added to that box, for all three.
      2. Add it to the box with minimum increase in size (resize that bounding box).

Initially, it might seem this isn't optimum -- the order in which the remaining rectangles are added in step 2.2 affects the bounding box size you get -- but when you pick up a new combination of three rectangles as your starting set it should catch the better configuration.
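A minimal sketch of the steps above, not a tested implementation from this answer. It assumes each rectangle is an object `{x1, y1, x2, y2}` with `x1 <= x2` and `y1 <= y2`; the helper names `area`, `union`, and `greedyThreeBoxes` are made up for illustration. Note that the resulting boxes may overlap, and the greedy growth is a heuristic, so the result is not guaranteed to be optimal.

```javascript
// Assumed rectangle shape: {x1, y1, x2, y2} with x1 <= x2 and y1 <= y2.
function area(r) {
  return (r.x2 - r.x1) * (r.y2 - r.y1);
}

// Smallest axis-aligned box containing both a and b (returns a new object).
function union(a, b) {
  return {
    x1: Math.min(a.x1, b.x1), y1: Math.min(a.y1, b.y1),
    x2: Math.max(a.x2, b.x2), y2: Math.max(a.y2, b.y2)
  };
}

// Try every seed triple, grow the three boxes greedily, keep the cheapest result.
function greedyThreeBoxes(rects) {
  var n = rects.length;
  if (n <= 3) return rects.slice(); // each rectangle is its own bounding box
  var best = null, bestArea = Infinity;
  for (var i = 0; i < n; i++) {
    for (var j = i + 1; j < n; j++) {
      for (var k = j + 1; k < n; k++) {
        var boxes = [rects[i], rects[j], rects[k]];
        for (var m = 0; m < n; m++) {
          if (m === i || m === j || m === k) continue;
          // Find the box whose area grows least when rects[m] is added.
          var pick = 0, minGrow = Infinity;
          for (var b = 0; b < 3; b++) {
            var grow = area(union(boxes[b], rects[m])) - area(boxes[b]);
            if (grow < minGrow) { minGrow = grow; pick = b; }
          }
          boxes[pick] = union(boxes[pick], rects[m]);
        }
        var total = area(boxes[0]) + area(boxes[1]) + area(boxes[2]);
        if (total < bestArea) { bestArea = total; best = boxes; }
      }
    }
  }
  return best;
}
```

If fewer than three boxes are wanted, boxes that can be merged without increasing the total area could be combined afterwards.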

Vanwaril
  • This approach is very interesting, even though it doesn't take into account the fact that the number of bounding rectangles, according to the problem statement, can be anywhere from one to three, i.e. there might be a "black" rectangle arrangement where the optimal solution requires that there be only one bounding "red" rectangle. – dave Feb 20 '11 at 08:51
  • Any optimal rectangle arrangement with 1 or 2 rectangles can be an arrangement with 3 rectangles of the same size, so the optimum solution can always be comprised of 3 bounding rectangles. – Vanwaril Feb 20 '11 at 09:14
  • @Vanwaril I see your point, even though creating a bounding rectangle has a cost. From this point of view, I assume an optimal solution would solve the problem with as few bounding rectangles as possible. – dave Feb 21 '11 at 04:04
  • @RPRPORO In that case, it's almost trivial to merge bounding rectangles if they are merge-able. – Vanwaril Feb 21 '11 at 07:50
  • @Vanwaril I see your point, although almost-optimal will never quite be optimal. (: – dave Feb 21 '11 at 17:04
1

At most 3 rectangles, everything is always going to be oriented and aligned on the x-y axes, and there is no overlap? You are in luck: there are O(n^2) sets of 3 such rectangles, and it is pretty easy to enumerate them with O(n^3) work. Given that you're dealing with a small enough number of black rectangles for visual display, enumerating them all and picking the best one should be more than fast enough.

First let us think about the 2 bounding rectangle case because it is simpler. It is easy to project the picture to the x-axis, and it is also easy to project the picture to the y-axis. At least one of those two projections will have a visible gap with no overlap. Therefore we can enumerate the possible ways of dividing into two rectangles by first projecting all of the black ones to line segments on the x-axis, looking for the gaps, and for each gap reconstructing which pair of bounding boxes we get. Then repeat the procedure with the y-axis, and we will get them all.

Now the 3 bounding rectangle case is similar. It turns out that given 3 non-overlapping rectangles that are oriented along the x-y axes, either the x projection or the y projection must have a visible gap. So we can do the same procedure as before, but instead of just constructing a pair of bounding boxes, we construct one bounding box on one side of the gap and divide the other side into 2 more using the same algorithm.

(By the way, you are lucky that you just wanted 3. This approach breaks down in the 4 bounding rectangle case, because then it is possible to have 4 bounding rectangles such that neither the x-projection nor the y-projection has any visible gaps.)

So how do we take n black rectangles, project them to one axis (let's say the x-axis), and look for the sets of bounding rectangles? You just sort them, construct the maximum overlapping intervals, and find the gaps. Like this:

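function find_right_boundaries_of_x_gaps (rectangles) {
  // Sort a copy by left edge so the caller's array is left untouched.
  var ordered_rect = rectangles.slice().sort(function (r1, r2) { return r1.x1 - r2.x1; });
  var gaps = [];
  if (ordered_rect.length === 0) {
    return gaps;
  }
  // Rightmost edge among the rectangles scanned so far.
  var max_right = ordered_rect[0].x2;
  for (var i = 0; i < ordered_rect.length; i++) {
    // A gap exists when the next rectangle starts past everything seen so far.
    if (max_right < ordered_rect[i].x1) {
      // Record the right edge of everything to the left of the gap.
      gaps.push(max_right);
    }
    if (max_right < ordered_rect[i].x2) {
      max_right = ordered_rect[i].x2;
    }
  }
  return gaps;
}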

Given a gap it is straightforward to figure out the 2-rectangle bounding box for what lies on each side. (It is even more straightforward if you have the ordered rectangles to do it with.)
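As a rough sketch of that step, the gap scan can produce the pair of bounding boxes directly, since after sorting everything before index `i` lies left of the gap and everything from `i` onward lies right of it. The names `boundingBox` and `pairsFromXGaps` and the `y1`/`y2` fields are assumptions for illustration; only `x1`/`x2` appear in the code above. This version recomputes each bounding box from scratch, which is simple but does O(n) work per gap.

```javascript
// Smallest axis-aligned box containing every rectangle in a non-empty array.
// Assumed rectangle shape: {x1, y1, x2, y2} with x1 <= x2 and y1 <= y2.
function boundingBox(rects) {
  return rects.reduce(function (acc, r) {
    return {
      x1: Math.min(acc.x1, r.x1), y1: Math.min(acc.y1, r.y1),
      x2: Math.max(acc.x2, r.x2), y2: Math.max(acc.y2, r.y2)
    };
  });
}

// Returns one [leftBox, rightBox] pair for every gap in the x projection.
function pairsFromXGaps(rects) {
  var ordered = rects.slice().sort(function (a, b) { return a.x1 - b.x1; });
  var pairs = [];
  var maxRight = ordered.length ? ordered[0].x2 : 0;
  for (var i = 1; i < ordered.length; i++) {
    if (maxRight < ordered[i].x1) {
      // ordered[0..i-1] lies entirely left of the gap, ordered[i..] right of it.
      pairs.push([boundingBox(ordered.slice(0, i)), boundingBox(ordered.slice(i))]);
    }
    maxRight = Math.max(maxRight, ordered[i].x2);
  }
  return pairs;
}
```

The y-axis version is the same with `y1`/`y2` in place of `x1`/`x2`.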

With these pieces you should now be able to write your code. Unfortunately naive approaches give you a choice between building up a lot of repetitive code, or else having to construct a lot of large data structures. However if you're comfortable with closures, you can address both problems in two very different ways.

The first is to construct closures that will, when called, iterate through the various data structures that you want. See http://perl.plover.com/Stream/stream.html for inspiration. The idea here is that you write a function which takes a set of rectangles and returns a stream of pairs of bounding boxes, then another function which takes a set of rectangles, gets the stream of pairs of bounding boxes, and returns a stream of triplets of bounding boxes. Then have a filter that takes that stream and finds the best one.

The other is inside out from that. Rather than return a function that can iterate through possibilities, pass in a function, iterate through possibilities, and call the function on each possibility. (Said function may do further iteration as well.) If you have any exposure to blocks in Ruby, this approach may make a lot of sense to you.
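Here is one way the callback style might look, as a sketch under the no-overlap assumption rather than a definitive implementation. All names (`eachGapSplit`, `eachSplit`, `bestCover`) and the `{x1, y1, x2, y2}` rectangle shape are assumptions for illustration. `bestCover` expects a non-empty array, and it only considers covers that can be reached by splitting at projection gaps.

```javascript
function area(r) { return (r.x2 - r.x1) * (r.y2 - r.y1); }

// Smallest axis-aligned box containing every rectangle in a non-empty array.
function boundingBox(rects) {
  return rects.reduce(function (acc, r) {
    return {
      x1: Math.min(acc.x1, r.x1), y1: Math.min(acc.y1, r.y1),
      x2: Math.max(acc.x2, r.x2), y2: Math.max(acc.y2, r.y2)
    };
  });
}

// Calls visit(lowGroup, highGroup) for every gap in the projection onto one
// axis; lo/hi name the fields for that axis ("x1"/"x2" or "y1"/"y2").
function eachGapSplit(rects, lo, hi, visit) {
  var ordered = rects.slice().sort(function (a, b) { return a[lo] - b[lo]; });
  var maxHigh = ordered.length ? ordered[0][hi] : 0;
  for (var i = 1; i < ordered.length; i++) {
    if (maxHigh < ordered[i][lo]) visit(ordered.slice(0, i), ordered.slice(i));
    maxHigh = Math.max(maxHigh, ordered[i][hi]);
  }
}

// Visits every split along either axis.
function eachSplit(rects, visit) {
  eachGapSplit(rects, "x1", "x2", visit);
  eachGapSplit(rects, "y1", "y2", visit);
}

// Returns the cheapest cover found that uses one, two, or three boxes.
function bestCover(rects) {
  var best = [boundingBox(rects)];
  var bestArea = area(best[0]);
  function consider(boxes) {
    var total = boxes.reduce(function (sum, b) { return sum + area(b); }, 0);
    if (total < bestArea) { bestArea = total; best = boxes; }
  }
  eachSplit(rects, function (a, b) {
    var boxA = boundingBox(a), boxB = boundingBox(b);
    consider([boxA, boxB]);
    // Keep one side whole and split the other side once more.
    eachSplit(a, function (a1, a2) { consider([boundingBox(a1), boundingBox(a2), boxB]); });
    eachSplit(b, function (b1, b2) { consider([boxA, boundingBox(b1), boundingBox(b2)]); });
  });
  return best;
}
```

Passing `visit` in as a function avoids building a list of every candidate split up front; each candidate is scored and discarded as soon as it is produced.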

If you're not familiar with closures, you may wish to ignore the last few paragraphs.

btilly
  • 2
    Look at the comment above by BlueRaja -- it's not necessary that there be gaps in X or Y. – Vanwaril Feb 21 '11 at 08:04
  • @Vanwaril: In a comment @louis said that there could not be any overlap. The requirement for no overlap makes my claim that there has to be a gap in X or Y true. If overlap is possible, then @BlueRaja's example is true. – btilly Feb 21 '11 at 08:22
  • Louis has rectified that -- he made that comment assuming overlaps implied there was a better solution. – Vanwaril Feb 21 '11 at 09:35
  • 1
    @Vanwaril: Indeed he did..after I made my post assuming the opposite. Ah well, at least I was very, very explicit about my explanation. – btilly Feb 21 '11 at 15:36
0

Isn't there a unique smallest bounding rectangle? Just take the max and min x- and y-coordinates among all the rectangles and make a rectangle from those specifications.

Jeffrey
  • But that would be for a single bounding rectangle. If you have up to 3, you can split it up and find the smallest set, but how..? – Louis Feb 20 '11 at 04:08
  • Given a set of axis-aligned rectangles, there is a unique smallest bounding rectangle. Are you saying that you are also given 3 bounding rectangles? If so, what are you supposed to do with them? I don't understand your question. – Jeffrey Feb 20 '11 at 04:15
  • No I mean most of the time having 3 bounding rectangles will produce a smaller area. But I need to work out what combination of bounding rectangles is the best (smallest area). See the diagram above. – Louis Feb 20 '11 at 04:52
0

I agree with the previous comment made by "mu is too short". One algorithm that solves your problem could partition all existing "black" rectangles into one to three geometrical clusters based on the multiplication of horizontal and vertical components of the distance between each pair of "black" rectangles (this will give you the area of a hypothetical "red" rectangle formed between each pair), and then bind each resulting cluster with a "red" rectangle.

Regardless of which geometric clustering algorithm you choose to solve that component of the problem (more on this below), it is important that you do not partition the "black" rectangles into clusters using the "straight", or Euclidean, distance between each pair as a parameter, as your problem involves reducing the area of bounding ("red") rectangles. As I mention in the preceding paragraph, you would need to multiply the horizontal and vertical components of the distance between each pair of "black" rectangles in order to account for the area a possible bounding "red" rectangle would cover.
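To illustrate why the choice of distance matters, here is a small sketch under one interpretation of the metric: the cost of a pair is the area of the box that would bound both rectangles. The function names and the `{x1, y1, x2, y2}` rectangle shape are assumptions for illustration, not part of any particular clustering library.

```javascript
// Area of the smallest axis-aligned box containing both a and b.
function pairBoundingArea(a, b) {
  var width = Math.max(a.x2, b.x2) - Math.min(a.x1, b.x1);
  var height = Math.max(a.y2, b.y2) - Math.min(a.y1, b.y1);
  return width * height;
}

// Euclidean distance between the centers of a and b, for comparison.
function centerDistance(a, b) {
  var dx = (a.x1 + a.x2) / 2 - (b.x1 + b.x2) / 2;
  var dy = (a.y1 + a.y2) / 2 - (b.y1 + b.y2) / 2;
  return Math.sqrt(dx * dx + dy * dy);
}

var origin = { x1: 0, y1: 0, x2: 1, y2: 1 };
var beside = { x1: 10, y1: 0, x2: 11, y2: 1 };  // same row as origin
var diagonal = { x1: 6, y1: 8, x2: 7, y2: 9 };  // offset on both axes

// Both neighbors are equally far from origin by center distance,
// yet the diagonal one needs a much larger bounding box.
console.log(centerDistance(origin, beside), pairBoundingArea(origin, beside));
console.log(centerDistance(origin, diagonal), pairBoundingArea(origin, diagonal));
```

A clustering step driven by Euclidean distance would treat the two neighbors as equally good partners, while the area-based cost prefers the one in the same row.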

There are many geometric clustering algorithms in the literature with differing time-space complexity trade-offs; I would suggest you start with this Google search and get acquainted with those. Alternatively, this problem can be approached without a clustering algorithm by using a genetic or simulated annealing algorithm, in which case the total area of various combinations and numbers of possible bounding "red" rectangles would be tried and measured in search of the best solution.

Feel free to ask for any needed clarification, and good luck with your project!

dave
  • 1
    The algorithm you gave doesn't work. For example, if there are four squares diagonally ordered, then the algorithm you gave might create two bounding boxes -- one pairing the inner two, and another bounding the outer two. Also, the problem with an infinite number of bounding boxes has a *very* different complexity -- the solutions aren't related. – Vanwaril Feb 21 '11 at 07:55
  • @Vanwaril The algorithm I provided works for the relaxed version of the problem I have formulated (unlimited bounding rectangles). In the example you cite ("four squares diagonally ordered"), the algorithm will *never* bind the "inner two" rectangles, and then bind the "outer two" due to the fact that in step 2., the list "binary-clusters" is sorted in ascending order by the area resulting from binding each pair of "black" rectangles, i.e. all "black" rectangles will already be bound by the time the algorithm reaches the "outer two" rectangles, *for they will be at the end of the sorted list*. – dave Feb 21 '11 at 17:28
  • @Vanwaril The fact that the relaxed version of the problem I have stated has a different complexity than the original problem has not escaped from me. The purpose of a relaxed version of a problem is to illustrate a possible direction in solving the original problem, relaxed versions of problems are often formulated in algorithm design for *didactic* purposes, and in order to formulate admissible heuristics. – dave Feb 21 '11 at 17:35
  • My suggestion for @Louis, still is to review the literature on existing *geometric clustering algorithms* to avoid trying to reinvent the wheel, *perhaps* in light of the algorithm I have provided for a relaxed version of his problem. Alternatively, he could also investigate the possibility of applying a genetic or simulated annealing algorithm to this problem. – dave Feb 21 '11 at 18:22
  • 1
    The algorithm doesn't work. Assume 4 squares diagonally (numbered 1 thru 4), you get n^2 pair-areas. The algorithm *could* pick up (2,3) first. IF it does, then it will not take (1,2) or (3,4), since 2 and 3 exist in the binary-clusters list (step 4.1). So it ends up taking (1,4). – Vanwaril Feb 21 '11 at 18:37
  • @Vanwaril The algorithm I provided works for the relaxed version of the problem I have formulated. The algorithm cannot pick up "2,3" first due to the fact that "1,2" is added to the list "pair-areas" before "2,3" in step 1, thus, when sorted in order of ascending pair areas (step 2), "1,2" will still appear before "2,3" even though they share the same value for their area. In order to "break" my algorithm, one would need to recur to an extraneous "black" rectangle labeling scheme, e.g. labeling the "black" rectangles "3, 1, 2, 4", as opposed to "1, 2, 3, 4" as you suggested. – dave Feb 21 '11 at 19:31
  • 1
    How can you assume (1,2) is added before (2,3)? (2,3) might have a smaller area. You cannot assume the labels will be in the right order either. – Vanwaril Feb 22 '11 at 00:52
  • @Vanwaril Although one can't assume the labels will be "in the right order", one can explicitly specify a labeling scheme such as top-down, left-right, or bottom-up, right-left, when adding items to the "pair-areas" list in step 1. Regarding the case where (2,3) might have a smaller area, you are right, it "breaks" the algorithm. I'm removing the worked example and leaving the rest of my answer as is. – dave Feb 22 '11 at 16:08
  • Finally: My suggestion for @Louis, is to review the literature on existing geometric clustering algorithms to avoid trying to reinvent the wheel. Alternatively, he could also investigate the possibility of applying a genetic or simulated annealing algorithm to this problem. – dave Feb 22 '11 at 16:09