I am using Python and MongoDB. I have a collection of 40,000 documents and a set of coordinates, and I need to find which document each coordinate belongs to. Currently I do this:

cell_start = citymap.find({"cell_latlng":{"$geoIntersects":{"$geometry":{"type":"Point", "coordinates":orig_coord}}}})

This is a standard GeoJSON query and it works well. However, I know that some documents have a field like this:

{'trips_dest':......}

The value of this field is not important, so I omit it here. The point is that, instead of searching all 40,000 documents, I could search only those that have the 'trips_dest' field.

Since only about 40% of the documents have the 'trips_dest' field, I think this would make the query more efficient. However, I don't know how to modify my code to do that. Any ideas?

gladys0313

1 Answer

You need the $exists query operator. Something like this:

cell_start = citymap.find({"trips_dest": {"$exists": True},
                           "cell_latlng":{"$geoIntersects":{"$geometry":{"type":"Point", "coordinates":orig_coord}}}})

To quote the documentation:

Syntax: { field: { $exists: <boolean> } }

When <boolean> is true, $exists matches the documents that contain the field, including documents where the field value is null.

If you need to reject null values, use:

 "trips_dest": {"$exists": True, "$ne": None}
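Putting the two variants together, the filter can be built as an ordinary Python dict before it is passed to find(). This is only a minimal sketch: the helper name build_cell_filter, its reject_null flag, and the sample coordinate are made up for illustration, while the field names come from the question.

```python
def build_cell_filter(coord, reject_null=False):
    """Build a filter for documents that have 'trips_dest' and whose
    'cell_latlng' geometry intersects the given point.

    build_cell_filter and reject_null are hypothetical names, not part
    of pymongo or of the original code.
    """
    exists_clause = {"$exists": True}
    if reject_null:
        # Also exclude documents where the field is present but null
        exists_clause["$ne"] = None
    return {
        "trips_dest": exists_clause,
        "cell_latlng": {
            "$geoIntersects": {
                "$geometry": {"type": "Point", "coordinates": list(coord)}
            }
        },
    }


# Hypothetical [longitude, latitude] pair, used only as sample input
orig_coord = [116.40, 39.90]
query = build_cell_filter(orig_coord, reject_null=True)

# With a real collection this would be used as:
# cell_start = citymap.find(query)
```

Note that in Python the operator names must be quoted strings, and the values are True and None rather than the true and null of the mongo shell.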

As a final note, a sparse index on 'trips_dest' might speed up such a query.
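A sparse index only contains entries for documents that actually have the indexed field. A rough sketch of creating one with pymongo follows; the helper name ensure_trips_dest_index is made up, and it assumes citymap is a pymongo Collection. Whether the query planner actually uses this index rather than a geospatial index on 'cell_latlng' depends on your data, so it is worth checking with the cursor's explain() output.

```python
def ensure_trips_dest_index(collection):
    """Create a sparse ascending index on 'trips_dest'.

    ensure_trips_dest_index is a hypothetical helper; 'collection' is
    assumed to be a pymongo Collection such as citymap.
    """
    # 1 is the ascending direction (the value of pymongo.ASCENDING);
    # sparse=True leaves out documents that lack the field.
    return collection.create_index([("trips_dest", 1)], sparse=True)


# With a real collection:
# ensure_trips_dest_index(citymap)
```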

Sylvain Leroux