5

I am using the spatialindex library from http://libspatialindex.github.com/

I am creating an R* tree in the main memory:

size_t capacity = 10;
bool bWriteThrough = false;
fileInMem = StorageManager
    ::createNewRandomEvictionsBuffer(*memStorage, capacity, bWriteThrough);

double fillFactor = 0.7;
size_t indexCapacity = 10;
size_t leafCapacity = 10;
size_t dimension = 2;
RTree::RTreeVariant rv = RTree::RV_RSTAR;
tree = RTree::createNewRTree(*fileInMem, fillFactor, indexCapacity,
   leafCapacity, dimension, rv, indexIdentifier);

Then I am inserting a large number of bounding boxes, currently some 2.5M (road network of Bavaria in Germany). Later I'll aim at inserting all roads of Europe.

What are good choice of parameters for the storage manager and rtree? Mostly I am using the rtree to find the closest roads to a given query (bbox intersection).

arthur
  • 1,034
  • 2
  • 17
  • 31
  • +1 Because I had never heard of an R* tree before, and it's pretty interesting. http://en.wikipedia.org/wiki/R*_tree – John Dibling Oct 25 '12 at 15:26
  • well, it is a pretty standard index structure, used to answer space range queries (give me all objects that fall into a given rectangle query) and nearest-neighbor queries (give me the k-closest objects to my point query). If you are interested, have a look at spatial indexes in general: http://en.wikipedia.org/wiki/Spatial_database – arthur Oct 25 '12 at 15:57

1 Answers1

3

As your data is static, a good bulk load may work for you. The most popular (and a rather simple) bluk load is Sort-Tile-Recursive. However, it is somewhat designed around point data. As you are inserting spatial objects, it may or may not work as well.

If you are using a bulk load, it will no longer be an R*-tree, but a plain R-tree.

Capacity 10 sounds way too little to me. You want a much larger fan-out. But you'll need to benchmark, this is data set and query dependant what is good. I'd definitely try 100 or more.

Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
  • Thanks for the comments and info. I already increased the both capacity values to 100. This builds a tree of height 3. I'll try to experiments with the values and post them here in a few days.. – arthur Nov 05 '12 at 14:37
  • According to what Mario's explained in http://lists.gispython.org/pipermail/spatialindex/2013-June/000354.html, the bulk loaded tree is neither plain R-tree, nor R*-tree. – mloskot Jun 10 '13 at 12:07
  • Then that mail is imprecise. Bulk-loading an R-tree with STR will not produce the *exact same* tree as an incrementally loaded R-tree (for obvious reasons, e.g. the fill rate being much higher than with an incremental loaded tree), but nevertheless it is a valid R- *and* R*-tree and can both be queried and updated the same way. – Has QUIT--Anony-Mousse Jun 10 '13 at 12:24