
I know some range-searching data structures, for example the kd-tree, range tree, and quadtree. But all the implementations I have seen are in-memory. How can I implement them on secondary memory with high I/O efficiency?

Here are the conditions:

1) A static set of points in two dimensions.

2) Queries only, no inserts or deletes.

3) Adapted for secondary memory.

Thanks.

Zhao Yuhao

1 Answer


If you can fit the tree into memory during construction:

  1. Build the kd-tree.

  2. Bottom-up, collect as many points as will fit into one block (page) of your hardware's block size.

  3. Write the data to this block.

  4. Repeat steps 2 and 3 recursively until you've written all data to disk.

When querying, load a page from disk and process this part of the tree until you reach a reference to another page. Then load that page and continue there.

Alternatively, you can do the same top-down, but then you will likely need more disk space. In the above approach, only the root page can be near-empty.
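
The bottom-up packing and the page-by-page query could look roughly like the following. This is only a minimal sketch under several assumptions, not a tuned implementation: the 4096-byte `PAGE_SIZE`, the record layout in `REC`, and the names `build`, `pack`, and `query` are all made up for illustration, and the query keeps every page it has read in an unbounded dictionary, which a real system would replace with a bounded buffer.

```python
import io
import struct

PAGE_SIZE = 4096                   # assumed block size in bytes
REC = struct.Struct("<ddbiiii")    # x, y, split axis, left (page, slot), right (page, slot)
CAP = PAGE_SIZE // REC.size        # node records that fit into one page
NIL = (-1, -1)                     # reference meaning "no child"

class Node:
    def __init__(self, pt, axis, left, right):
        self.pt, self.axis, self.left, self.right = pt, axis, left, right
        self.ref = None            # (page, slot), set once the node is assigned to a page

def build(points, depth=0):
    """Step 1: build a plain in-memory kd-tree by median split."""
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1), build(pts[mid + 1:], depth + 1))

def pack(root, f):
    """Steps 2-4: write the tree to f page by page, bottom-up. Returns the root reference."""
    if root is None:
        return NIL
    pages = 0

    def flush(n):
        # Collect the nodes below n that have no page yet and write them as one page.
        nonlocal pages
        nodes, stack = [], [n]
        while stack:
            m = stack.pop()
            if m is None or m.ref is not None:
                continue
            m.ref = (pages, len(nodes))
            nodes.append(m)
            stack += [m.right, m.left]
        buf = bytearray(PAGE_SIZE)             # zero-padded to a full block
        for m in nodes:
            left = m.left.ref if m.left else NIL
            right = m.right.ref if m.right else NIL
            REC.pack_into(buf, m.ref[1] * REC.size,
                          m.pt[0], m.pt[1], m.axis, *left, *right)
        f.seek(pages * PAGE_SIZE)
        f.write(buf)
        pages += 1

    def visit(n):
        # Post-order walk; returns how many nodes below n are still unwritten (<= CAP).
        if n is None:
            return 0
        l, r = visit(n.left), visit(n.right)
        while 1 + l + r > CAP:                 # does not fit: write the larger child first
            if l >= r:
                flush(n.left)
                l = 0
            else:
                flush(n.right)
                r = 0
        return 1 + l + r

    visit(root)
    flush(root)                                # the remainder becomes the root page
    return root.ref

def query(f, root_ref, lo, hi):
    """Report points with lo <= p <= hi per coordinate, plus the number of page reads."""
    out, reads, cache = [], 0, {}
    stack = [root_ref]
    while stack:
        page, slot = stack.pop()
        if page < 0:
            continue
        if page not in cache:                  # reference into a page not loaded yet
            f.seek(page * PAGE_SIZE)
            cache[page] = f.read(PAGE_SIZE)
            reads += 1
        x, y, axis, lp, ls, rp, rs = REC.unpack_from(cache[page], slot * REC.size)
        if lo[0] <= x <= hi[0] and lo[1] <= y <= hi[1]:
            out.append((x, y))
        c = (x, y)[axis]
        if lo[axis] <= c:
            stack.append((lp, ls))
        if hi[axis] >= c:
            stack.append((rp, rs))
    return out, reads
```

For example, `root_ref = pack(build(points), f)` with `f` opened in binary read/write mode writes the pages, and `query(f, root_ref, (0.2, 0.3), (0.5, 0.6))` returns the matching points together with the number of pages read. In this sketch a child is only written out when it and its sibling together no longer fit next to their parent, so every page except the root page ends up at least half full.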

Has QUIT--Anony-Mousse
  • I know what you mean. First, it sounds very hard to implement, right? Second, there are many details to take care of, for example how to maintain the tree shape within a disk page. – Zhao Yuhao Dec 09 '14 at 03:17
  • Yes, it's not easy. The R-tree is much nicer, because it is already disk-page-oriented. – Has QUIT--Anony-Mousse Dec 09 '14 at 08:57