
For a group project I'm currently part of, we have to simulate the following: take a square with side length n and distribute unit disks uniformly at random over it. Find the number of disks required until there is a connected component of disks stretching from the left side to the right side of the square. Then find the number of disks required until the square is entirely filled with disks.

It's not explicitly stated, but we assume this is to be done in Matlab, as we use it in other parts of this course. For the first problem, finding a path from the left to the right, I've written two methods that work. One uses an adjacency list and the graph tools in Matlab for finding connected nodes. This method is fast enough but takes up far too much memory for what we need to do. The other method uses a recursive search algorithm without storing adjacency information, but is far too slow.

The problem arises when the square needs to be of size n = 1000 and n = 10,000. We predict this will require some tens of millions of circles, or more, and we simply do not see how we're supposed to handle this: any adjacency list or matrix would be ridiculously large, and not using one seems to require a huge amount of time. Any thoughts and ideas are appreciated, thanks.

Hufsa
  • A) find a different algorithm that does not require that much memory. B) get more RAM. – Ander Biguri May 05 '17 at 10:29
  • it's a probabilistic problem, there's no "number of disks" which solves it, right? could you define the mathematical problem more clearly (e.g. "entirely filled")? and it will be better if you could supply a toy example. – user2999345 May 05 '17 at 10:55
  • @user2999345 Yes, it's very much probabilistic - this project is part of a course in probability and statistics :) Already got an answer down below, but thanks anyway! And just if you're interested, we are supposed to simulate this a large number of times for each _n_, to find some expected number of disks required, and analyze the data a bit – Hufsa May 06 '17 at 20:51

1 Answer


Let's reformulate your problems as decision problems, like this:

Starting with seed s for the PRNG, randomly distribute m unit disks in an n x n square, and determine whether or not they make a connected path from the left side to the right.

and:

Starting with seed s for the PRNG, randomly distribute m unit disks in an n x n square, and determine whether or not they cover the entire square.

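If you can solve these decision problems, then you can find the minimum value for m by doing a binary search over the possible values. The binary search relies on the answer being monotone in m: adding disks can never break a path or uncover a point, as long as the disks generated for m are a subset of the disks generated for any larger m.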
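As a minimal sketch of that search (in Python, though it translates directly to Matlab): `min_disks` and `decide` are hypothetical names, and `decide(m)` stands in for whichever decision procedure you implement. It is assumed to be False for small m and True from some point on. The upper bound is found by doubling, so no prior estimate of m is needed.

```python
def min_disks(decide, start=1):
    """Return the smallest m >= start for which decide(m) is True.

    Assumes decide is monotone: False, ..., False, True, ..., True.
    """
    # Grow an upper bound by doubling until the predicate holds.
    hi = start
    while not decide(hi):
        hi *= 2
    # decide(hi // 2) was False unless we never doubled.
    lo = hi // 2 + 1 if hi > start else start
    # Ordinary binary search on the interval [lo, hi].
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


# Toy predicate standing in for a real simulation.
print(min_disks(lambda m: m >= 37))
```

Each call to `decide` is a full simulation run, so the total cost is roughly the cost of one run times the logarithm of the answer.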

You can solve these decision problems without using a whole lot of RAM by generating your disks like so:

  1. Initialize your PRNG with the given seed s
  2. For each disk 1 ... m, randomly select the column it will go in (that's floor(center.x)). Accumulate counts for all the columns to determine how many disks will go in each one. Let COUNT(col) be the number of disks to generate in column col
  3. For each col in 0 ... n-1, generate the COUNT(col) disks uniformly distributed within that column.
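The three steps above can be sketched as a generator. This is an illustration in Python rather than a definitive implementation; `disks_by_column` is a hypothetical name, and columns are assumed to be vertical strips of width 1. Only the array of n counts and one column's disks are held in memory at a time.

```python
import random


def disks_by_column(seed, m, n):
    """Yield (col, centres) for col = 0 .. n-1, from left to right.

    centres is a list of (x, y) disk centres with col <= x < col + 1.
    """
    rng = random.Random(seed)          # step 1: seed the PRNG
    counts = [0] * n
    for _ in range(m):                 # step 2: choose a column per disk
        counts[rng.randrange(n)] += 1
    for col in range(n):               # step 3: place disks column by column
        yield col, [(col + rng.random(), n * rng.random())
                    for _ in range(counts[col])]


for col, centres in disks_by_column(seed=1, m=20, n=4):
    print(col, len(centres))
```

Note that with a single PRNG stream like this, the positions drawn for m disks are not a subset of those drawn for m + 1 disks, so a different scheme for deriving positions would be needed if the disk sets have to be nested across values of m.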

Now you are generating your disks predominantly from left to right. Both of the above decision problems have solutions that work from left to right and only need to see the disks in a few adjacent columns at a time, so you only need to remember one or two columns' worth of disks.

For the full coverage problem, work from left to right and determine whether each column in the square is completely covered, using disks from that column and adjacent columns. No other disks could possibly reach.

For the left-right path problem, work from left to right using union-find to keep track of which disks in the current column are connected to the left side and to each other. When you get to the last column, check whether any disk in it is connected to the left side and reaches the right side.
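A sketch of that sweep in Python, under assumptions that are not in the text above: disks have radius 1, two disks are connected when their centres are at most 2 apart, and a disk touches a side when it reaches it. `has_left_right_path` is a hypothetical name. With radius 1 a disk can touch disks up to two columns back, so the sketch keeps three columns in its window and rebuilds a small union-find whenever a column is dropped.

```python
import math

R = 1.0  # assumed disk radius; the +/-2 cell search below relies on R <= 1


def has_left_right_path(columns, n):
    """columns: iterable of lists of (x, y) centres, one per unit column, left to right."""
    LEFT = 0
    parent = [0]    # node 0 is the left side; node i + 1 is window[i]
    window = []     # (col, x, y) for disks in the last three columns
    cells = {}      # (col, floor(y)) -> node ids, for neighbour lookup

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    for col, pts in enumerate(columns):
        for x, y in pts:
            node = len(parent)
            parent.append(node)
            window.append((col, x, y))
            if x - R <= 0:
                union(node, LEFT)
            row = math.floor(y)
            for dc in (-2, -1, 0):
                for dr in (-2, -1, 0, 1, 2):
                    for other in cells.get((col + dc, row + dr), ()):
                        _, ox, oy = window[other - 1]
                        if (x - ox) ** 2 + (y - oy) ** 2 <= (2 * R) ** 2:
                            union(node, other)
            cells.setdefault((col, row), []).append(node)
        # Drop disks more than one column back, keeping component identity.
        rep = {find(LEFT): 0}               # old root -> new node id
        new_parent, new_window, new_cells = [0], [], {}
        for i, (c, x, y) in enumerate(window):
            if c >= col - 1:
                node = len(new_parent)
                new_parent.append(rep.setdefault(find(i + 1), node))
                new_window.append((c, x, y))
                new_cells.setdefault((c, math.floor(y)), []).append(node)
        parent[:], window[:] = new_parent, new_window
        cells.clear()
        cells.update(new_cells)

    left_root = find(LEFT)
    return any(x + R >= n and find(i + 1) == left_root
               for i, (_, x, _) in enumerate(window))


print(has_left_right_path([[(0.5, 2.0)], [], [(2.0, 2.0)], [(3.5, 2.0)]], 4))
```

Bucketing disks by `floor(y)` keeps the neighbour search local; comparing every pair of disks in the window would be quadratic in the number of disks per column.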

Matt Timmermans