
Is there any way to query location data with a MongoDB geospatial query that matches the following criterion?

  • Get all locations that lie within the intersection of two boxes or, more generally, two polygons.

For example, in the image below, can the query output contain only those locations that are within the yellow area, which is the area common to the purple and red geometric objects (polygons)?

[Image: a purple polygon and a red polygon overlapping, with their common area shown in yellow]

My study of the MongoDB documentation so far

Use case

    db.places.find( {
        loc: { $geoWithin: { $box: [ [ 0, 0 ], [ 100, 100 ] ] } }
    } )

The query above returns results within a rectangular area. I am looking for the locations that are common to two such individual queries:

    db.places.find( {
        loc: { $geoWithin: { $box: [ [ 0, 0 ], [ 100, 100 ] ] } }
    } )

    db.places.find( {
        loc: { $geoWithin: { $box: [ [ 50, 50 ], [ 90, 120 ] ] } }
    } )
Neil Lunn
Viraj
  • Does this help? -http://docs.mongodb.org/manual/reference/operator/query/geoIntersects/#example. Please include sample documents in your collection depicting coordinates. – BatScream Dec 29 '14 at 07:09
  • @BatScream The sample document is as follows: { "_id" : "35004", "city" : "ACMAR", "loc" : [ -86.51557, 33.584132 ], "pop" : 6055, "state" : "AL" } – Viraj Dec 29 '14 at 07:28
  • @BatScream Thanks for the response. Actually the mongodb doc link you gave provides results based on polygon. What I am looking for something like this. 1. First the query gets results within box A. 2. Second the query gets results within box B. 3. Third the query outputs the results that are common to both box A and B which means the query provides intersection to two boxes or maybe in general two polygons :D. Is there a way to do it? – Viraj Dec 29 '14 at 07:47
  • If I get this right, then what you are asking for is the resulting "polygon" from the intersection of "multiple" polygon definitions. Yes? If so then MongoDB is not going to do this by itself with the Geo functions it has. You need to "work out" what the polygon is from the "intersection" first, then pass that to [**`$geoWithin`**](http://docs.mongodb.org/manual/reference/operator/query/geoWithin/) to find data falling within that "polygon result". Is that what you want? – Neil Lunn Dec 29 '14 at 08:58
  • @NeilLunn Yes :) just wanted to make sure if there is any direct geospatial query that works this way... so far I haven't found any such direct query :( – Viraj Dec 29 '14 at 09:27
  • No there ain't. What you want is the "resulting polygon" from the "intersection". But you need those co-ordinates first. MongoDB does not do this. Then you want to use those "polygon" co-ordinates as a query with either **`$geoWithin`** or **`$geoIntersects`**, as is appropriate to your purpose. So not MongoDB by itself, but have faith, someone should take the time to show you how to calculate the co-ordinates. If you don't work that out yourself that is. An answer is still valid here. And useful. – Neil Lunn Dec 29 '14 at 09:31
  • Actually I eat my words. In the middle of putting up an example for getting the intersection via library, I saw a clear way to do it. I still say you should do that externally, but at least there is an answer to explain the options. – Neil Lunn Dec 30 '14 at 05:26

2 Answers


So looking at this with a fresh mind the answer is staring me in the face. The key thing that you have already stated is that you want to find the "intersection" of two queries in a single response.

Another way to look at this is that you want all of the points bound by the first query to then be the "input" for the second query, and so on as required. That is essentially what an intersection is, and here the logic can be applied literally.

So just use the aggregation framework to chain the matching queries. For a simple example, consider the following documents:

{ "loc" : { "type" : "Point", "coordinates" : [ 4, 4 ] } }
{ "loc" : { "type" : "Point", "coordinates" : [ 8, 8 ] } }
{ "loc" : { "type" : "Point", "coordinates" : [ 12, 12 ] } }

And the chained aggregation pipeline, just two queries:

db.geotest.aggregate([
    { "$match": {
        "loc": {
            "$geoWithin": {
                "$box": [ [0,0], [10,10] ]
            }
        }
    }},
    { "$match": {
        "loc": {
            "$geoWithin": {
                "$box": [ [5,5], [20,20] ]
            }
        }
    }}
])

So if you consider that logically, the first stage finds the points that fall within the bounds of the initial box, which are the first two documents. Those results are then acted on by the second stage, and since the new box bounds start at [5,5], that excludes the first point. The third point was already excluded by the first stage. If the order of the boxes were reversed, the result would still be only the middle document.
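To make the chaining concrete, here is a minimal sketch in plain JavaScript that mimics the two stages on the three sample documents without touching a database. The `inBox` helper is hypothetical and only stands in for `$geoWithin` with `$box`; treating the box edges as inclusive is an assumption of this sketch, and none of the sample points sit on an edge anyway.

```javascript
// Illustrative only: plain JavaScript stand-ins for the two $match stages.
// inBox is a hypothetical helper, not part of MongoDB or its driver.
function inBox(point, box) {
  var lo = box[0], hi = box[1];
  return point[0] >= lo[0] && point[0] <= hi[0] &&
         point[1] >= lo[1] && point[1] <= hi[1];
}

var docs = [
  { loc: { type: 'Point', coordinates: [ 4, 4 ] } },
  { loc: { type: 'Point', coordinates: [ 8, 8 ] } },
  { loc: { type: 'Point', coordinates: [ 12, 12 ] } }
];

var boxes = [
  [ [0,0], [10,10] ],
  [ [5,5], [20,20] ]
];

// Each box filters the output of the previous one, like chained stages.
var survivors = boxes.reduce(function(current, box) {
  return current.filter(function(doc) {
    return inBox(doc.loc.coordinates, box);
  });
}, docs);

console.log(JSON.stringify(survivors));
```

Only the document at [ 8, 8 ] is left in `survivors`, and reversing the order of `boxes` gives the same single document.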

How this works is quite unique to the $geoWithin query operator as compared to various other geo functions:

$geoWithin does not require a geospatial index. However, a geospatial index will improve query performance. Both 2dsphere and 2d geospatial indexes support $geoWithin.

So the results are both good and bad. Good in that you can do this type of operation without an index in place, but bad because once the first pipeline stage has produced its results, no further index can be used by the stages that follow. So any performance benefit of an index is lost when merging the "set" results from anything after the initial Polygon/MultiPolygon as supported.


For this reason I would still recommend that you calculate the intersection bounds "outside" of the query issued to MongoDB. Even though the aggregation framework can do this due to the "chained" nature of the pipeline, and even though resulting intersections will get smaller and smaller, your best performance is a single query with the correct bounds that can use all of the index benefits.
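For the simple case of two axis-aligned boxes, the common bounds can be worked out by hand before querying. The following is only a sketch under that assumption: `intersectBoxes` is a hypothetical helper, and it expects each box in the `$box` form of bottom-left corner followed by upper-right corner. General polygons need a geometry library instead.

```javascript
// Hypothetical helper: intersect two boxes given in $box form
// [ [minX, minY], [maxX, maxY] ]. Returns null when they do not overlap.
function intersectBoxes(a, b) {
  var minX = Math.max(a[0][0], b[0][0]);
  var minY = Math.max(a[0][1], b[0][1]);
  var maxX = Math.min(a[1][0], b[1][0]);
  var maxY = Math.min(a[1][1], b[1][1]);
  if (minX > maxX || minY > maxY) return null;
  return [ [minX, minY], [maxX, maxY] ];
}

var common = intersectBoxes([ [0,0], [10,10] ], [ [5,5], [20,20] ]);

// Build one query document from the common bounds. A null result means
// the boxes do not overlap, so there is nothing to query for.
var query = common === null
  ? null
  : { loc: { $geoWithin: { $box: common } } };

console.log(JSON.stringify(query));
```

With the two boxes from the pipeline example, `common` is `[ [5,5], [10,10] ]`, and the resulting `query` could be passed to a single `db.geotest.find(query)` call.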

There are various methods for doing that, but for reference here is an implementation using the JSTS library, which is a JavaScript port of the popular JTS library for Java. There may be others or other language ports, but this has simple GeoJSON parsing and built-in methods for such things as getting the intersection bounds:

var async = require('async'),
    util = require('util'),
    jsts = require('jsts'),
    mongo = require('mongodb'),
    MongoClient = mongo.MongoClient;

var parser = new jsts.io.GeoJSONParser();

var polys= [
  {
    type: 'Polygon',
    coordinates: [[
      [ 0, 0 ], [ 0, 10 ], [ 10, 10 ], [ 10, 0 ], [ 0, 0 ]
    ]]
  },
  {
    type: 'Polygon',
    coordinates: [[
      [ 5, 5 ], [ 5, 20 ], [ 20, 20 ], [ 20, 5 ], [ 5, 5 ]
    ]]
  }
];

var points = [
  { type: 'Point', coordinates: [ 4, 4 ]  },
  { type: 'Point', coordinates: [ 8, 8 ]  },
  { type: 'Point', coordinates: [ 12, 12 ] }
];

MongoClient.connect('mongodb://localhost/test',function(err,db) {

  // Stop early if the connection itself failed
  if (err) throw err;

  db.collection('geotest',function(err,geo) {

    if (err) throw err;

    async.series(
      [
        // Insert some data
        function(callback) {
          var bulk = geo.initializeOrderedBulkOp();
          bulk.find({}).remove();
          async.each(points,function(point,callback) {
            bulk.insert({ "loc": point });
            callback();
          },function(err) {
            bulk.execute(callback);
          });
        },

        // Run each version of the query
        function(callback) {
          async.parallel(
            [
              // Aggregation
              function(callback) {
                var pipeline = [];
                polys.forEach(function(poly) {
                  pipeline.push({
                    "$match": {
                      "loc": {
                        "$geoWithin": {
                          "$geometry": poly
                        }
                      }
                    }
                  });
                });

                geo.aggregate(pipeline,callback);
              },

              // Using external set resolution
              function(callback) {
                var geos = polys.map(function(poly) {
                  return parser.read( poly );
                });

                var bounds = geos[0];

                for ( var x=1; x<geos.length; x++ ) {
                  bounds = bounds.intersection( geos[x] );
                }

                var coords = parser.write( bounds );

                geo.find({
                  "loc": {
                    "$geoWithin": {
                      "$geometry": coords
                    }
                  }
                }).toArray(callback);
              }
            ],
            callback
          );
        }
      ],
      function(err,results) {
        if (err) throw err;
        console.log(
          util.inspect( results.slice(-1), false, 12, true ) );
        db.close();
      }
    );

  });

});

The full GeoJSON "Polygon" representations are used there, as this translates to what JSTS can understand and work with. Chances are any input you might receive for a real application would be in this format as well, rather than relying on conveniences such as $box.
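If the input does arrive as `$box` style corner pairs, it can be converted to the GeoJSON form used in the listing. This is a small sketch with a hypothetical `boxToPolygon` helper; it emits the corners in the same order as the `polys` array above and closes the ring by repeating the first point. Keep in mind that `$geometry` queries use spherical geometry, so the converted polygon is not guaranteed to match a flat `$box` exactly over large areas.

```javascript
// Hypothetical helper: turn [ [minX, minY], [maxX, maxY] ] into a
// GeoJSON Polygon with a single closed ring.
function boxToPolygon(box) {
  var minX = box[0][0], minY = box[0][1];
  var maxX = box[1][0], maxY = box[1][1];
  return {
    type: 'Polygon',
    coordinates: [[
      [ minX, minY ], [ minX, maxY ], [ maxX, maxY ], [ maxX, minY ],
      [ minX, minY ]
    ]]
  };
}

var poly = boxToPolygon([ [0,0], [10,10] ]);
console.log(JSON.stringify(poly));
```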

So it can be done with the aggregation framework, or even parallel queries merging the "set" of results. But while the aggregation framework may do it better than merging sets of results externally, the best results will always come from computing the bounds first.
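For completeness, merging the "set" of results from parallel queries on the client could look something like the sketch below. The `intersectById` helper is hypothetical, the sample arrays stand in for what two separate `find()` calls might return, and comparing `_id` values through `String()` is an assumption that suits string ids and ObjectId values.

```javascript
// Hypothetical helper: keep only the documents whose _id appears in
// every result set. Each result set is an array of documents.
function intersectById(resultSets) {
  if (resultSets.length === 0) return [];
  return resultSets.slice(1).reduce(function(kept, docs) {
    var ids = {};
    docs.forEach(function(doc) { ids[String(doc._id)] = true; });
    return kept.filter(function(doc) {
      return ids[String(doc._id)] === true;
    });
  }, resultSets[0]);
}

// Stand-ins for the results of one query per box.
var fromBoxA = [
  { _id: 'a', loc: { type: 'Point', coordinates: [ 4, 4 ] } },
  { _id: 'b', loc: { type: 'Point', coordinates: [ 8, 8 ] } }
];
var fromBoxB = [
  { _id: 'b', loc: { type: 'Point', coordinates: [ 8, 8 ] } },
  { _id: 'c', loc: { type: 'Point', coordinates: [ 12, 12 ] } }
];

var merged = intersectById([ fromBoxA, fromBoxB ]);
console.log(JSON.stringify(merged));
```

Every result set has to be transferred to the client before the merge, which is part of why computing the bounds first remains the better option.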

Neil Lunn

In case anyone else looks at this: as of MongoDB version 2.4, you can use $geoIntersects to find the intersection of GeoJSON objects, which supports intersections of two polygons, among other types.

{
  <location field>: {
     $geoIntersects: {
        $geometry: {
           type: "<GeoJSON object type>" ,
           coordinates: [ <coordinates> ]
        }
     }
  }
}

There is a nice write up on this blog.

Harry
    Well no actually. The question asks to send "two or more polygons" and see if "those" shapes "intersect" and contain "points" found within the "intersection of that result". `$geoIntersects` is for searching "using" a geometry object and finding if any "geometry object" in the "collection" actually "intersects a boundary" on the object issued in the query. These are two "completely" different things. – Blakes Seven Jul 13 '15 at 10:07