I have a fairly large set of data in a text file; suppose each line represents a rectangle:
x1,y1,x2,y2
After I read the file, how do I bulk load the data and build an R-tree index using JTS (http://www.vividsolutions.com/jts/javadoc/index.html)? I checked its API, and it seems that only insert can be used for bulk loading.
Here is my test code:
import java.util.List;
import com.vividsolutions.jts.geom.Envelope;
import com.vividsolutions.jts.index.strtree.STRtree;

STRtree rtree = new STRtree();
rtree.insert(new Envelope(1.0, 2.0, 1.2, 3.4), new Integer(1));
rtree.insert(new Envelope(4.0, 3.2, 1.9, 4.4), new Integer(2));
rtree.insert(new Envelope(3.4, 3.8, 2.2, 5.2), new Integer(3));
rtree.insert(new Envelope(2.1, 5.3, 5.2, 3.6), new Integer(4));
rtree.insert(new Envelope(4.2, 2.2, 2.9, 10.3), new Integer(5));
// query() returns every item whose envelope intersects the search envelope
List<Object> list = rtree.query(new Envelope(1.4, 5.6, 2.0, 3.0));
Is this the right way to build an R-tree index (just using the insert method)?
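From the javadoc, my understanding (which may well be wrong) is that insert only collects the items and the packed tree is built lazily on the first query, or when build() is called explicitly, so inserting everything and then building would effectively be a bulk load. A minimal sketch of what I mean (in Scala, since the rest of my pipeline will be Scala/Spark), assuming I am reading the STRtree API correctly:

import com.vividsolutions.jts.geom.Envelope
import com.vividsolutions.jts.index.strtree.STRtree

val rtree = new STRtree()
rtree.insert(new Envelope(1.0, 2.0, 1.2, 3.4), Integer.valueOf(1))
rtree.insert(new Envelope(4.0, 3.2, 1.9, 4.4), Integer.valueOf(2))

// my assumption: build() packs the tree here (a query would otherwise
// trigger it lazily), and no further inserts are allowed afterwards
rtree.build()

val hits = rtree.query(new Envelope(1.4, 5.6, 2.0, 3.0))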
Another question: suppose the input file is very large, say GB or even TB scale, and stored in HDFS. In that case, I would like a parallel version of the code above based on Apache Spark.
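What I have in mind is roughly the sketch below (not tested at scale, and the per-partition design and file path are just my assumptions): build one STRtree per partition with mapPartitions and merge the query results afterwards, since sharing a single tree across executors does not look possible. I am also assuming the Envelope constructor takes (x1, x2, y1, y2), which is why the fields from an x1,y1,x2,y2 line are reordered.

import com.vividsolutions.jts.geom.Envelope
import com.vividsolutions.jts.index.strtree.STRtree
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("Parallel STRtree sketch"))
val lines = sc.textFile("hdfs://localhost:9000/user/chenzhongpu/testData.dat")

// build one STRtree per partition; each tree indexes only that partition's rectangles
val partitionTrees = lines.mapPartitions { iter =>
  val tree = new STRtree()
  iter.foreach { line =>
    val a = line.split(",").map(_.toDouble)
    // Envelope takes (x1, x2, y1, y2); the input lines are x1,y1,x2,y2
    tree.insert(new Envelope(a(0), a(2), a(1), a(3)), line)
  }
  Iterator(tree)
}

// run a range query against every per-partition tree and merge the hits
val queryEnv = new Envelope(1.4, 5.6, 2.0, 3.0)
val hits = partitionTrees.flatMap(tree => tree.query(queryEnv).toArray).collect()
println(hits.length)

Does this kind of per-partition indexing make sense, or is there a better way?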
Last, any ideas on saving the R-tree to a file so that it can be restored for later use?
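One thing I was considering (not sure whether it is the recommended approach): STRtree appears to implement java.io.Serializable, so plain Java object serialization might be enough once the tree is built. A sketch, with a made-up file name:

import java.io.{FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}
import com.vividsolutions.jts.index.strtree.STRtree

// rtree is the STRtree built above; its stored items must be serializable too
val out = new ObjectOutputStream(new FileOutputStream("strtree.ser"))
out.writeObject(rtree)
out.close()

// later: read the tree back and query it as usual
val in = new ObjectInputStream(new FileInputStream("strtree.ser"))
val restored = in.readObject().asInstanceOf[STRtree]
in.close()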
Edit:

Now I read the HDFS file to build the index; here is my code:
import com.vividsolutions.jts.geom.Envelope
import com.vividsolutions.jts.index.strtree.STRtree
import org.apache.spark.{SparkConf, SparkContext}

val inputDataPath = "hdfs://localhost:9000/user/chenzhongpu/testData.dat"
val conf = new SparkConf().setAppName("Range Query")
// notice that: the function names for queries differ across systems.
// here we simply refer to intersect.
val sc = new SparkContext(conf)
val inputData = sc.textFile(inputDataPath).cache()

val strtree = new STRtree
inputData.foreach { line =>
  val array = line.split(",").map(_.toDouble)
  strtree.insert(new Envelope(array(0), array(1), array(2), array(3)),
    new Rectangle(array(0), array(1), array(2), array(3)))
}
I called insert in foreach, and when I print the size of strtree, it is zero! Why doesn't the insert method inside foreach work? Did I miss something?
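My current guess is that the closure passed to foreach carries a serialized copy of strtree to each executor, so the inserts happen on those copies and the driver's tree stays empty. As a workaround I am experimenting with parsing on the executors and inserting on the driver, which of course only works while everything fits in driver memory (and again assumes the Envelope constructor order is (x1, x2, y1, y2)):

// guess at a workaround: parse on the executors, collect, and insert on the driver;
// only feasible while the parsed envelopes fit into driver memory
val envelopes = inputData.map { line =>
  val a = line.split(",").map(_.toDouble)
  // Envelope takes (x1, x2, y1, y2); my lines are x1,y1,x2,y2
  new Envelope(a(0), a(2), a(1), a(3))
}.collect()

val strtree = new STRtree
// using the envelope itself as the stored item here, as a stand-in for my Rectangle
envelopes.foreach(env => strtree.insert(env, env))
println(strtree.size())  // non-zero now

Is there a way to build the tree in a distributed fashion instead?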