
This is an interview question; the interview is over.

Given a deck of rectangular cards placed randomly on a rectangular table whose area is much larger than the total area of the cards, some cards may overlap each other randomly. Design an algorithm that calculates the area of the table covered by the cards, and analyze its time complexity. The coordinates of every vertex of every card are known, and the cards can overlap in any pattern.

My idea:

Sort the cards by their topmost vertical coordinates in descending order.

Scan the cards vertically from top to bottom. Each time the scan line reaches an edge or a vertex of a card, stop it there, continue with the next scan line until it reaches another edge, and compute the area located between the two lines. Finally, sum the areas of all the strips to get the result.

But how to compute the area located between two lines is a problem if that area is irregular.

Any help is appreciated. Thanks!

andand
user1002288
  • are they all orientated the same way (i.e. the cards will never be rotated at various angles etc.)? – Nim Mar 28 '12 at 15:12
  • Compute triangles from card vertices and calculate the area occupied by these triangles, factoring for overlaps. – Justin Pearce Mar 28 '12 at 15:37
  • Why triangles? The cards can overlap in any pattern. Thanks! – user1002288 Mar 28 '12 at 16:09
  • The question specifically says the table is "much larger than the total sum of the cards' size". So I think the problem is just asking you to efficiently _find_ overlapping cards and then use whatever code you like to compute their intersection. (It is OK for the latter to be slow because there are very very few overlapping cards with very very high probability). – Nemo Mar 29 '12 at 01:58
  • @user1002288 Because every polygon can be decomposed into triangles. That's only relevant if the answer to Nim's question is that they _can_ be rotated - otherwise, there are much simpler solutions. – Nick Johnson Mar 29 '12 at 09:32
  • -1 for not clarifying Nim's question - very important. – Tomas Mar 29 '12 at 20:36

5 Answers


This could be done easily using the inclusion-exclusion formula (size of A union B union C = A + B + C - AB - AC - BC + ABC, etc.), but since that sums over every non-empty subset of cards, it results in an exponential O(2^n) algorithm. There is another, more complicated way that results in O(n^2 (log n)^2).
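For intuition, here is a minimal sketch of the inclusion-exclusion approach. It assumes, unlike the general interview setting, that the cards are axis-aligned rectangles given as (x1, y1, x2, y2) tuples; the function names are illustrative only, and for rotated cards the subset intersections would need a convex-polygon clipper instead.

```python
from itertools import combinations

def rect_intersection_area(rects):
    # Area of the common intersection of axis-aligned rectangles
    # (x1, y1, x2, y2); 0 if they do not all overlap.
    x1 = max(r[0] for r in rects)
    y1 = max(r[1] for r in rects)
    x2 = min(r[2] for r in rects)
    y2 = min(r[3] for r in rects)
    return max(0, x2 - x1) * max(0, y2 - y1)

def union_area(rects):
    # Inclusion-exclusion: add single areas, subtract pairwise
    # intersections, add triple intersections, ... (2^n - 1 terms total).
    total = 0
    for k in range(1, len(rects) + 1):
        sign = (-1) ** (k + 1)
        for subset in combinations(rects, k):
            total += sign * rect_intersection_area(subset)
    return total
```

The exponential number of subsets is exactly why this is only usable for a handful of cards.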


Store each card as a polygon + its area in a list. Compare each polygon in the list to every other polygon. If they intersect, remove them both from the list, and add their union to the list. Continue until no polygons intersect. Sum their areas to find the total area.

The polygons can be concave and have holes, so computing their intersection is not easy. However, there are algorithms (and libraries) available to compute it in O(k log k), where k is the number of vertices. Since the number of vertices can be on the order of n, this means computing the intersection is O(n log n).

Comparing every polygon to every other polygon is O(n^2). However, we can use an O(n log n) sweeping algorithm to find nearest polygons instead, making the overall algorithm O((n log n)^2) = O(n^2 (log n)^2).

Glorfindel
BlueRaja - Danny Pflughoeft
  • To the first part of your answer--easily how? – Colonel Panic Mar 28 '12 at 18:11
  • @Matt: Sum the area of all cards; subtract the area of the intersection of all pairs; add intersection of all triplets; subtract quadruplets etc. The intersections will have no holes and [be convex](http://wiki.answers.com/Q/Intersection_of_two_convex_set_is_convex), so finding them is [much](http://cgafaq.info/wiki/Intersection_Of_Convex_Polygons) [easier](http://www.iro.umontreal.ca/~plante/compGeom/algorithm.html) than in the more complex case. – BlueRaja - Danny Pflughoeft Mar 28 '12 at 20:54
  • Cards are convex. The intersection of convex shapes is convex. Got it, thanks. – Colonel Panic Mar 29 '12 at 10:15

This is almost certainly not what your interviewers were looking for, but I'd've proposed it just to see what they said in response:

I'm assuming that all cards are the same size and are strictly rectangular with no holes, but that they are placed randomly in an X,Y sense and also oriented randomly in a theta sense. Therefore, each card is characterized by a triple (x,y,theta) or of course you also have your quad of corner locations. With this information, we can do a monte carlo analysis fairly simply.

Simply generate a number of points at random on the surface of the table, and determine, using the card list, whether or not each point is covered by any card. If yes, keep it; if not, throw it out. Estimate the area covered by the cards as the table area times the ratio of kept points to total points.

Obviously, you can test each point in O(n) where n is the number of cards. However, there is a slick little technique that I think applies here, and I think will speed things up. You can grid out your table top with an appropriate grid size (related to the size of the cards) and pre-process the cards to figure out which grids they could possibly be in. (You can over-estimate by pre-processing the cards as though they were circular disks with a diameter going between opposite corners.) Now build up a hash table with the keys as grid locations and the contents of each being any possible card that could possibly overlap that grid. (Cards will appear in multiple grids.)

Now every time you need to include or exclude a point, you don't need to check each card, but only the pre-processed cards that could possibly be in your point's grid location.

There's a lot to be said for this method:

  • You can pretty easily change it up to work with non-rectangular cards, esp if they're convex
  • You can probably change it up to work with differently sized or shaped cards, if you have to (and in that case, the geometry really gets annoying)
  • If you're interviewing at a place that does scientific or engineering work, they'll love it
  • It parallelizes really well
  • It's so cool!!

On the other hand:

  • It's an approximation technique (but you can run to any precision you like!)
  • You're in the land of expected runtimes, not deterministic runtimes
  • Someone might actually ask you detailed questions about Monte Carlo
  • If they're not familiar with Monte Carlo, they might think you're making stuff up

I wish I could take credit for this idea, but alas, I picked it up from a paper calculating surface areas of proteins based on the position and sizes of the atoms in the proteins. (Same basic idea, except now we had a 3D grid in 3-space, and the cards really were disks. We'd go through and for each atom, generate a bunch of points on its surface and see if they were or were not interior to any other atoms.)

EDIT: It occurs to me that the original problem stipulates that the total table area is much larger than the total card area. In this case, an appropriate grid size means that a majority of the grids must be unoccupied. You can also pre-process grid locations, once your hash table is built up, and eliminate those entirely, only generating points inside possibly occupied grid locations. (Basically, perform individual MC estimates on each potentially occluded grid location.)
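As a concrete illustration of the point-sampling plus grid-hash idea above, here is a hedged sketch. It assumes axis-aligned cards given as (x1, y1, x2, y2) for simplicity; for rotated cards, bucket by bounding box as the answer suggests and swap in a point-in-polygon test. All names and parameters here are invented for the example.

```python
import random
from collections import defaultdict

def estimate_covered_area(cards, table_w, table_h, cell, samples=200_000, seed=1):
    # Pre-process: map each grid cell to the cards that could overlap it,
    # so a sample point is tested only against nearby cards.
    buckets = defaultdict(list)
    for card in cards:
        x1, y1, x2, y2 = card
        for gx in range(int(x1 // cell), int(x2 // cell) + 1):
            for gy in range(int(y1 // cell), int(y2 // cell) + 1):
                buckets[(gx, gy)].append(card)

    # Monte Carlo: sample random points on the table, count covered ones.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        px, py = rng.uniform(0, table_w), rng.uniform(0, table_h)
        for x1, y1, x2, y2 in buckets.get((int(px // cell), int(py // cell)), ()):
            if x1 <= px <= x2 and y1 <= py <= y2:
                hits += 1
                break
    return table_w * table_h * hits / samples
```

The estimate converges as 1/sqrt(samples), so you can trade runtime for precision, and the sampling loop parallelizes trivially.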

Novak
  • Very clever. I like the fact that it's a lot simpler to implement than solutions that require decomposing polygons, assuming you're willing to tolerate an approximation. – Nick Johnson Mar 29 '12 at 09:35

Here's an idea that is not perfect but is practically useful. You design an algorithm that depends on an accuracy parameter epsilon (eps). Imagine splitting the plane into squares of size eps x eps; you then want to count the number of squares lying inside the cards. Let the number of cards be n and, to keep the analysis simple, let each card be a square with side length S.

Here is a naive way to do it:

covered = {} // hash set of grid cells, keyed by integer indices
for every card:
   for x in [min x value of card, max x value of card] step eps:
       for y in [min y value of card, max y value of card] step eps:
           if (x, y) is in the card:
               covered.add((round(x / eps), round(y / eps)))
return size(covered) * eps * eps

The algorithm runs in O(n * (S/eps)^2) time. Only grid squares cut by a card's boundary can be misclassified, so the absolute error is bounded by 2 * S * n * eps, and the relative error is therefore at most 2 * eps * n / S.

So, for example, to guarantee a relative error below 1%, you have to choose eps less than S / (200 n), and the algorithm then runs in about 200^2 * n^3 steps.
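The same idea in runnable form, as a sketch assuming axis-aligned cards given as (x1, y1, x2, y2): a cell is counted when its centre lies inside some card, and cells are keyed by integer indices so a cell reached from two overlapping cards is counted only once.

```python
import math

def covered_area(cards, eps):
    # Approximate the union area of axis-aligned cards (x1, y1, x2, y2)
    # by counting eps x eps grid cells whose centre lies inside some card.
    # For rotated cards, swap the test for a point-in-polygon check.
    covered = set()
    for x1, y1, x2, y2 in cards:
        for gx in range(math.floor(x1 / eps), math.ceil(x2 / eps) + 1):
            for gy in range(math.floor(y1 / eps), math.ceil(y2 / eps) + 1):
                cx, cy = (gx + 0.5) * eps, (gy + 0.5) * eps  # cell centre
                if x1 <= cx <= x2 and y1 <= cy <= y2:
                    covered.add((gx, gy))  # integer keys dedupe overlaps
    return len(covered) * eps * eps
```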

aelguindy
  • Isn't this basic raycasting ? – Stefano Borini Mar 28 '12 at 16:17
  • @StefanoBorini I don't know what raycasting is, but I would be surprised if what I describe is not already being done (and maybe given a name too) :-). – aelguindy Mar 28 '12 at 16:20
  • You divide the surface in cells, shoot an imaginary ray from each cell center, and check if there's an intersection with an object or not. It's a technique used to make ray tracing images (at least, the most basic ones). – Stefano Borini Mar 28 '12 at 16:31
  • This isn't raycasting, it's 'numerical integration', or 'area-estimation by point-counting'. Then again, it might as well be called raycasting too. – High Performance Mark Mar 28 '12 at 17:01
  • A lot of graphics techniques are closely related to numerical integration techniques. This is one of them. When you're using grid points in this fashion, I think it's usually called a quadrature technique. When you're using random points, it's Monte Carlo. Monte Carlo will have better convergence properties in higher dimensional spaces. – Novak Mar 28 '12 at 19:59

Suppose there are n cards of unit area, and let T be the area of the table. For the discretised problem, where each card lands on one of T unit cells uniformly at random, the expected area covered will be

$ T(1-({{T-1}\over{T}})^n) $
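One way to sanity-check this formula (a brute-force check, not part of the original answer): in the discretised model, the expectation can be computed exactly for small T and n by enumerating all T^n equally likely placements and averaging the number of distinct cells covered.

```python
from fractions import Fraction
from itertools import product

def expected_covered_exact(T, n):
    # Enumerate every placement of n unit cards on T cells and average
    # the number of distinct cells covered, as an exact rational.
    total = sum(len(set(p)) for p in product(range(T), repeat=n))
    return Fraction(total, T ** n)

def expected_covered_formula(T, n):
    # The closed form from the answer: T * (1 - ((T-1)/T)^n).
    return T * (1 - Fraction(T - 1, T) ** n)
```

Both agree because, by linearity of expectation, each cell is covered with probability 1 - ((T-1)/T)^n.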

Colonel Panic

T = The total area of the table.

C = The total area that could be covered by cards (area of one card times number of cards).

V = The total area of overlapping cards (V = oVerlap)

Area to calculate = T - (C - V)

There should be (yep, those are danger words) some way to efficiently analyze the space occupied by the cards, to readily identify overlapping vs. non-overlapping situations. Identify these, factor out all overlapped areas, and you're done.

Time complexity would be in considering each card in order, one by one, and comparing each with each remaining card (card 2 has already been checked against card 1), which makes it O(n^2) comparisons, still not good when each comparison involves geometry... but this is where the "should" comes in. There must be some efficient way to remove all cards that do not overlap from consideration, to order cards so that it is obvious when they could not possibly overlap other/prior cards, and perhaps to identify or group potentially overlapping cards.

Interesting problem.

Philip Kelley
  • Calculating the "total area of overlapping cards" does not work, since more than two cards can overlap the same space. Imagine for example, all cards are in exactly the same place - then the "total area of overlapping cards" is equal to the area of one card! Also, your formula is clearly incorrect - imagine T is very large, then `T - (C - V)` could be larger than `C`!! – BlueRaja - Danny Pflughoeft Mar 28 '12 at 16:00
  • +1: this is basically right except it calculates the area not covered rather than the area covered (a trivial issue). V here needs to be interpreted correctly, so that if two cards overlap, subtract the area of this overlap once, and then if a third card is put on to exactly cover this overlap, subtract the area of overlap one more time, etc. That is, if all cards are stacked, C = 52*A, and V=51*A, or the area of one card, which is correct. – tom10 Mar 28 '12 at 16:26
  • That's why you (painfully) go through card by card. The "first" (or lowest) card covers an area, while all subsequent (stacked) cards that overlap it do not--for the area in which they overlap. – Philip Kelley Mar 28 '12 at 16:38