I'm looking for a good tree structure to store and update my application data.

  • The data are positions in 3 dimensions (x, y, z)
  • They must support fast updates and fast range queries (both happen every 30 milliseconds). An example query: "get all the points within a radius of 100 cm around (2,3,4)"
  • The data is always in internal memory.

Could anyone recommend a type of tree that meets these requirements?

KD-trees wouldn't work for me because they are not designed to be updated at this rate; I would have to rebuild the whole tree on every update. BKD-trees wouldn't work either because they are designed to store data on disk (not in internal memory). Apparently R-trees are also designed to store the data in the leaves.

PacaPaw
  • "Apparently the R-Trees are also designed to store the data in the leaves (on disk).". So, since you want to store data on disk, why are you not using R-trees? I think they are the tool for the job (specifically R-Star-trees). – TilmannZ Dec 21 '22 at 20:57
  • Sorry if I wasn't clear enough in the original post. The data is always stored and queried in internal memory, not on disk or external storage. My English is not the best. – PacaPaw Dec 22 '22 at 12:05

1 Answer

If you need fast updates as well as fast range queries, all in memory, I can recommend either a grid index or the PH-tree.

A grid index is essentially a 2D/3D array of buckets. The grid is laid over your data space and each point is stored in the bucket (= grid cell) that contains it. For a range query you check all entries in all buckets that overlap the query range. It takes a bit of trial and error to find the best cell size. In my experience this is the best solution in 2D with 1000 points or fewer. I have no experience with 3D grid indexes.
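To make the grid idea concrete, here is a minimal 3D sketch in C++. It is not taken from any library: `Point`, `GridIndex`, `queryRadius` and the other names are made up for illustration. It assumes each point has a unique integer id, that coordinates and the cell size share one unit (e.g. cm), and that cell coordinates fit in 21 bits per axis. A hash map stands in for the dense array so that empty cells cost nothing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical point type: a unique id plus a position.
struct Point { int id; double x, y, z; };

// Illustrative grid index: buckets keyed by integer cell coordinates.
class GridIndex {
public:
    explicit GridIndex(double cellSize) : cell_(cellSize) { assert(cellSize > 0.0); }

    void insert(const Point& p) { buckets_[keyOf(p)].push_back(p); }

    // Removes the point with p.id from the cell that contains p's position.
    bool remove(const Point& p) {
        auto it = buckets_.find(keyOf(p));
        if (it == buckets_.end()) return false;
        auto& b = it->second;
        auto hit = std::find_if(b.begin(), b.end(),
                                [&](const Point& q) { return q.id == p.id; });
        if (hit == b.end()) return false;
        *hit = b.back();               // swap-and-pop: order inside a bucket is irrelevant
        b.pop_back();
        if (b.empty()) buckets_.erase(it);
        return true;
    }

    // An update is a removal at the old position plus an insert at the new one.
    void update(const Point& oldP, const Point& newP) { remove(oldP); insert(newP); }

    // Returns the ids of all points within distance r of (cx, cy, cz).
    std::vector<int> queryRadius(double cx, double cy, double cz, double r) const {
        std::vector<int> out;
        const double r2 = r * r;
        // Visit every cell that overlaps the bounding box of the query sphere.
        for (std::int64_t ix = coord(cx - r); ix <= coord(cx + r); ++ix)
            for (std::int64_t iy = coord(cy - r); iy <= coord(cy + r); ++iy)
                for (std::int64_t iz = coord(cz - r); iz <= coord(cz + r); ++iz) {
                    auto it = buckets_.find(pack(ix, iy, iz));
                    if (it == buckets_.end()) continue;
                    for (const Point& p : it->second) {
                        const double dx = p.x - cx, dy = p.y - cy, dz = p.z - cz;
                        if (dx * dx + dy * dy + dz * dz <= r2) out.push_back(p.id);
                    }
                }
        return out;
    }

private:
    std::int64_t coord(double v) const {
        return static_cast<std::int64_t>(std::floor(v / cell_));
    }
    // Packs three cell coordinates into one key, 21 bits per axis (assumption:
    // cell coordinates stay within [-2^20, 2^20), otherwise keys can collide).
    static std::uint64_t pack(std::int64_t ix, std::int64_t iy, std::int64_t iz) {
        const std::uint64_t mask = (1ULL << 21) - 1;
        return ((static_cast<std::uint64_t>(ix) & mask) << 42) |
               ((static_cast<std::uint64_t>(iy) & mask) << 21) |
               (static_cast<std::uint64_t>(iz) & mask);
    }
    std::uint64_t keyOf(const Point& p) const {
        return pack(coord(p.x), coord(p.y), coord(p.z));
    }

    double cell_;
    std::unordered_map<std::uint64_t, std::vector<Point>> buckets_;
};
```

With a cell size close to the typical query radius (say `GridIndex g(100.0)` for 100 cm queries), a call like `g.queryRadius(2, 3, 4, 100)` touches at most 27 cells. Smaller cells mean more cells to visit but fewer points per cell, which is where the trial and error comes in.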

For larger datasets I recommend the PH-tree (disclaimer: self-advertisement). Updates are much faster than with R-trees, and deletion is as fast as insertion. There is no rebalancing (unlike R-trees or some kd-trees), so insertion and deletion times are quite predictable (rebalancing is neither needed nor possible; imbalance is inherently limited). Range queries (= window queries) are a bit slower than with R-trees, but the difference almost disappears for very small ranges (windows). It is available in Java and C++.

TilmannZ
  • This sounds exactly like what I need! I'm glad you introduced me to the PH-tree because I hadn't found it on my own. I'm going to test the C++ implementation. Thanks!! – PacaPaw Dec 30 '22 at 15:10