Randomly splitting a population in 6

Question

I'm working on an algortihm for particle diffusion for a cellular automaton on a triangular grid. This means every cell has 6 neighbors.

Each cell has a certain number of particles.

Each cells spreads its particles to neighboring cells in each iteration.

I'm having a lot of trouble doing this efficiently, since there are hundreds of thousands (or sometimes millions) of cells each with a large number of particles n in them (n >> 100).

I'm looking for an algorithm that splits a number randomly into 6 parts

A working but inefficient approach:

Generate as many random numbers, as there are particles at a cell, drawn from a uniform distribution on the interval (0,6).

If the number is in (0,1): spread the particle to neighbor 1.
If the number is in (1,2): spread the particle to neighbor 2.
If the number is in (2,3): spread the particle to neighbor 3.
etc...

This works for a 'small' number of particles (n < 50), but gets very computationally intensive.

My theoretical approach:

Call the number of particles to be distributed n.

Generate 5 random numbers drawn from a normal (gaussian) distribution with mean 0 and variance 1. Call these numbers r0, r1, r2, r3, r4

r0 = n/2 + r0*(n/4) // this transforms r0 to a random number drawn from a normal distribution with mean n/2 and variance n/2

r0 effectively splits the population of n particles into two groups, each to be distributed to three neighbors each. One of size r0, one of size n - r0

r1 = r0/3 + r0*(r0/9) // this transforms r1 to a random number drawn from a normal distribution with mean r0/3 and variance r0/3

r1 effectively splits the population of r0 particles into two groups, one to be distributed to a single neighbor the other to be distributed to two neighbors. The former of size r1, the latter of size r0 - r1

r2 = (r0 - r1)/2 + r2*((r0 - r1)/4) // this transforms r2 to a random number drawn from a normal distribution with mean (r0 - r1)/2 and variance (r0 - r1)/2

r2 effectivle splits the population of (r0 - r1) particles into two groups, each to be distributed to a single neighbor.

The numbers r0, r1 and r2 should now have split off 3 random parts off the population of n particles, each according to a normal distribution.

Continue in the same fashion, splitting the population (n - r0) into three parts.

The approach seems to make sense to me, but I believe my variances might be way off here, so I'm getting strange results, where too many particles are getting 'split off' to one group of neighbors and none being left over for the other neighbors. This introduces strange looking anisotropic effects.

Background: The combination of many uniform distribution is well approximated by a gaussian one. This algorithm is a try to modify an algorithm described by Bastien Chopard in "Cellular Automata Modeling of physical systems" chapter 5.7 (page 213)

Any help in seeing a mistake in my approach or a different, similarly efficient one would be very much appreciated.

I haven't specified the language im coding in, because I'm just looking for an algorithm in general. I'm using java (Processing 3.5), but if you can provide in any language whatsoever, that is fine with me.

The keyword you're probably looking for is ["multinomial distribution"](https://en.wikipedia.org/wiki/Multinomial_distribution). — Ilmari Karonen, Aug 01 '20 at 14:35
You will likely get more educated answers in [Cross Validated.SE](https://stats.stackexchange.com/) (Note: Before asking there, go through the site guide lines to see if it indeed fits there, I am not very active in cross validated I am afraid, and am not sure my suggestion is indeed correct). — amit, Aug 01 '20 at 14:58

David Eisenstat · Accepted Answer · 2020-08-01T16:41:46.050

As far as I can tell from reviewing the literature, the state of the art is an algorithm due to Brown and Bromberg ("An Efficient Two-Stage Procedure for Generating Random Variates from the Multinomial Distribution", from 1983), which specializes to your problem like this.

For some constant c, sample six independent Poisson variates with rate cn. If their sum is greater than n, redraw them until their sum is less than or equal to n. Spread particles as dictated by these variates, and spread the rest using your inefficient approach. c should be in the ballpark of 1/6; too high and we draw the variates too many times, too low and we make too much work for the inefficient algorithm.

Since you need a ton of samples over the life of the simulation, we don't have to work too hard to reduce the cost of setting up to draw Poisson variates. For each possible n, set up with the alias method to sample the truncated Poisson distribution with rate cn given that the outcome does not exceed n.

Randomly splitting a population in 6

1 Answers1