2

This is a sort of algorithmic question, not tied to any particular language.

Let's say I have Np point particles with continuous (read: double) x, y coordinates on a 2d plane. The 2d plane is divided into N ⨉ N cells.

For each particle I want a quick way (faster than O(Np^2)) to find the other particles in the same cell. Also, I don't want to use too much memory, so I'd rather not store another N ⨉ N + Np array.

I "invented" a tricky way to do this, but I'm asking this question in case there's a canonical way of doing it.

hayk
  • 388
  • 1
  • 5
  • 20
  • Add a field to each particle that says what cell it's in? – SirGuy Feb 23 '17 at 20:26
  • Maybe he is looking for Voronoi diagrams... – BitTickler Feb 23 '17 at 20:29
  • @SirGuy, but then you need to parse all the particles `Np^2` times. – hayk Feb 23 '17 at 20:31
  • Is it important to be able to cheaply update the data structure, say after the particles have moved and changed cells? Or is it just the querying that should be optimized? – snow_abstraction Feb 23 '17 at 20:32
  • @BitTickler, could you explain how Voronoi diagrams might be exploited in this case? – hayk Feb 23 '17 at 20:32
  • @snow_abstraction, I'm going to do this process at each timestep, but how does that make a difference? – hayk Feb 23 '17 at 20:34
  • @HaykHakobyan, yes. In that case, depending on the data structure used, you'll need fast insertion and delete. For instance, if the algorithm moves particles between various lists/arrays then those might need to be resized or allocated intelligently to avoid expensive memory allocations/deallocations. – snow_abstraction Feb 23 '17 at 20:58

4 Answers

1

The canonical way to do this is to use a spatial indexing data structure, e.g. a kd-tree, with O(Np*log(Np)) construction time, O(Np^(1−1/K)+Mp) axis-aligned range (which is your cell) query time (K=2 dimensions, Mp points reported), O(Np) space.
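For concreteness, here is a minimal pure-Python sketch of such a structure: a 2-d kd-tree with an axis-aligned range query, where one grid cell is one query rectangle. The names `build` and `range_query` are illustrative, and a real application would normally use a library kd-tree rather than this simplified construction:

```python
# Minimal 2-d kd-tree sketch: build the tree, then answer an
# axis-aligned range query (one grid cell = one rectangle).
# Illustrative only; not an optimized or library implementation.

def build(points, depth=0):
    """Recursively build a kd-tree node: (point, left, right, axis)."""
    if not points:
        return None
    axis = depth % 2                      # alternate x / y splits
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                # median point becomes the node
    return (points[mid],
            build(points[:mid], depth + 1),
            build(points[mid + 1:], depth + 1),
            axis)

def range_query(node, lo, hi, out):
    """Collect every point p with lo[k] <= p[k] <= hi[k] for k in {0, 1}."""
    if node is None:
        return
    point, left, right, axis = node
    if all(lo[k] <= point[k] <= hi[k] for k in (0, 1)):
        out.append(point)
    if lo[axis] <= point[axis]:           # rectangle overlaps left subtree
        range_query(left, lo, hi, out)
    if point[axis] <= hi[axis]:           # rectangle overlaps right subtree
        range_query(right, lo, hi, out)

particles = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9), (0.4, 0.1)]
tree = build(particles)
hits = []
range_query(tree, (0.0, 0.0), (0.25, 0.3), hits)  # cell [0,0.25] x [0,0.3]
```

Querying one cell returns exactly the particles whose coordinates fall inside that cell's rectangle.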

yuri kilochek
  • 12,709
  • 2
  • 32
  • 59
1

Here is a solution with O(Np * log(Np)) time and O(Np) memory:

Initialize a dynamic DS container with {row,col} tuple as a key \
    and a list of particles as a value
Iterate over each particle
    Find {row, col} tuple for current particle
    Find a value-list in container by {row, col} key
    If there is no value-list in container for this key
        Then initialise a new empty particle list
    Append current particle to a value-list

The container may be implemented as a balanced binary tree, which adds a log(Np) factor to the overall time complexity.
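A sketch of that container in Python, using a hash map (`dict`) in place of the balanced tree, which drops the expected total time to O(Np). The `cell_size` parameter and the `(row, col)` key layout are illustrative assumptions:

```python
# Sketch of the grouping container above, keyed by (row, col).
# A hash map gives expected O(1) per insert instead of O(log Np).
from collections import defaultdict

def group_by_cell(particles, cell_size):
    """Map each (row, col) cell to the list of particles inside it."""
    cells = defaultdict(list)             # missing key -> new empty list
    for x, y in particles:
        key = (int(y // cell_size), int(x // cell_size))  # (row, col)
        cells[key].append((x, y))
    return cells

particles = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9)]
cells = group_by_cell(particles, cell_size=0.5)
# neighbours of a particle p = cells[cell key of p]
```

Only occupied cells get an entry, so the memory stays O(Np) regardless of N.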


Another way to solve it, with O(Np + N) time and O(N) memory:

Initialize a simple lookup array byRow of size N, \
    it will contain a list of particles in each cell
Iterate over each particle
    Place the particle in corresponding cell of lookup array byRow by its ROW
Initialize another lookup array byCol of size N, \
    it will contain a list of particles in each cell as well
Iterate over each cell of lookup list byRow 
    Iterate over each particle of the list in byRow[cellRow]
        Place the particle in corresponding cell of byCol by its COL
    Iterate over each particle of the list in byRow[cellRow]
        \\ Now you have a list of other particles in the same NxN cell
        \\ by looking at byCol[particleCol]
        If byCol[particleCol] is not empty
            Print byCol[particleCol] list or put into other global storage and use later \
        Clear byCol[particleCol] list

The idea is very simple. First you group the particles by row, storing them in the lists of the byRow array. Then, for the particles in each list of byRow, you do the same grouping by column, reusing the byCol array each time, so the extra memory is O(N). Even though the two loops are nested, the overall time complexity is still O(Np + N), because no inner step is executed more than Np times in total.

Edit: Time complexity is O(Np + N) to be precise.
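A sketch of the two-pass grouping in Python, assuming coordinates in [0, 1) and an N ⨉ N grid (function and variable names are illustrative):

```python
# Two-pass byRow/byCol grouping: O(Np + N) time, O(N) extra memory.
# Assumes coordinates in [0, 1); names are illustrative.
def cells_by_row_then_col(particles, n):
    by_row = [[] for _ in range(n)]       # pass 1: group by row
    for x, y in particles:
        by_row[int(y * n)].append((x, y))

    by_col = [[] for _ in range(n)]       # reused for every row
    groups = []                           # particle lists, one per used cell
    for row_particles in by_row:
        touched = []                      # columns used in this row
        for p in row_particles:           # pass 2: group this row by column
            col = int(p[0] * n)
            if not by_col[col]:
                touched.append(col)
            by_col[col].append(p)
        for col in touched:               # emit and clear only used slots,
            groups.append(by_col[col])    # never scanning all N columns
            by_col[col] = []
    return groups

parts = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9), (0.4, 0.1)]
groups = cells_by_row_then_col(parts, n=2)
```

Clearing only the `touched` columns is the trick from the comments below: the cost of resetting byCol is proportional to the particles in the row, not to N.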

Ivan Gritsenko
  • 4,166
  • 2
  • 20
  • 34
  • But then you'll need `N⨉N+Np` memory, to store the particles in that container, right? That's a bad option if you have millions of particles flying around. – hayk Feb 23 '17 at 20:54
  • 1
    @Hayk Hakobyan, no you don't. First step of the algo does not create a list of particle for every possible {row,col} tuple. This list is created in the body loop. You only create a value-list when you **actually** need it. At most you will have Np such lists (with a single particle each), in case all {row, col} are different for all particles. – Ivan Gritsenko Feb 23 '17 at 20:57
  • Ok I think I got it. I had a similar idea, but instead of key-value list I thought of giving each particle an index (sorted properly according to their cells) and storing the total number of particles in a given cell. This is `N^2` in memory and `Np` (`Np log(Np)`?) in time. – hayk Feb 23 '17 at 21:03
  • @Hayk Hakobyan, if you use `N^2` memory you can perform in `O(Np)` time. So my solution uses only `O(Np)` memory but is `log(Np)` times slower. – Ivan Gritsenko Feb 23 '17 at 21:10
  • This algorithm takes O(Np + N^2) time since you have to iterate over byCol for every entry in byRow. This is fine, though, as long as you choose N to be O(sqrt(Np)). – Running Wild Feb 23 '17 at 22:09
  • @Running Wild, I'm not iterating over byCol, I'm iterating over the particles of a particular row of the byRow array. So it's O(Np). – Ivan Gritsenko Feb 23 '17 at 22:11
  • At the very least you have to clear byCol for every iteration of byRow, how else would you do that without O(N^2) work? – Running Wild Feb 24 '17 at 15:02
  • @Running Wild, that's right, I have to clear byCol for every iteration of byRow. `Clear byCol[particleCol] list` is doing exactly that. But here is a trick, why should I iterate over each element of byCol (including the empty ones) if I can just clear directly those elements where once I have put a particle. (so I'm iterating over particles clearing their lists). – Ivan Gritsenko Feb 24 '17 at 15:19
0

Construct a list (or array) of sorted tuples of (used cell id, a list of particles in that cell) sorted by the cell id. The cell id could just be its (x, y) coordinates. The space complexity is O(Np). Constructing it should take O(Np log(Np)) time. Looking up the particles in the same cell is then O(log(Np)) via standard binary search.

To replace log(Np) by 1 in these complexity estimates to get O(Np) construction time and O(1) look-up, replace the sorted list with a hash table.
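A sketch of the sorted-list variant in Python, using `bisect` for the O(log Np) binary search; the `(row, col)` cell id and the `cell_size` parameter are illustrative assumptions:

```python
# Sorted list of (cell_id, particle_list) tuples, looked up by binary
# search. Build: O(Np log Np); lookup: O(log Np). Illustrative sketch.
import bisect

def build_index(particles, cell_size):
    """Sorted list of (cell_id, particles_in_cell) pairs."""
    tagged = sorted(((int(y // cell_size), int(x // cell_size)), (x, y))
                    for x, y in particles)
    index = []
    for cell_id, p in tagged:             # merge runs of equal cell ids
        if index and index[-1][0] == cell_id:
            index[-1][1].append(p)
        else:
            index.append((cell_id, [p]))
    return index

def lookup(index, cell_id):
    """Binary-search the particle list of one cell, O(log Np)."""
    i = bisect.bisect_left(index, (cell_id,))
    if i < len(index) and index[i][0] == cell_id:
        return index[i][1]
    return []

parts = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9)]
idx = build_index(parts, cell_size=0.5)
```

Swapping the sorted list for a `dict` keyed by `cell_id` gives the hash-table version with O(1) expected lookup.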

snow_abstraction
  • 408
  • 6
  • 13
0

There's no real answer other than to keep a list of particles attached to each cell, which is another N ⨉ N + Np data structure. However, memory is cheap, and you should be able to afford it.

Malcolm McLean
  • 6,258
  • 1
  • 17
  • 18