1

I have a collection with geospatial data (2dsphere) and timestamps (date) which I need to query based on location (i.e. geospatial) and time (i.e. date).

I have created a compound index with geospatial and time as indicies. However when I perform the query (i.e. based on location & time) the queryplanner says it does a COLLSCAN.

Also, when I add an index with only time it uses that index (ie. IXSCAN with the timestamp index).

I'm a bit confused, I have read that you should be able to use 2dsphere and timestamp as a compounded index for fast finds based on location and time (e.g. MongoDB Indexing: Multiple single-field vs single compound?).

So I guess my question is whether it is possible to use 2dsphere as a compounded index? Does a 2dsphere index behave the same as other index types (i.e. ints, dates etc) or are there any big differences? For example, assume that we have the indicies "timestamp", "geospatial" (in that index order), Can I query only based on timestamp (since it is the first index)?

Link to the docs about 2dsphere: https://docs.mongodb.com/manual/core/2dsphere/.

Thanks in advance.

Example data:

{
    "_id" : ObjectId("5e300c9673f256100d5ca0b2"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            18.2,
            59.8
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T12:01:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0b3"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            12.3,
            52.4
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T12:10:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0b4"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            28.8,
            50.2
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T12:50:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0b5"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            28.8,
            50.2
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T13:50:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0af"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            -102.5,
            22.6
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T12:01:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0b0"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            -104.5,
            21
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T12:10:00.158Z")
    }
}
{
    "_id" : ObjectId("5e300c9673f256100d5ca0b1"),
    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            -104.5,
            21
        ]
    },
    "properties" : {
        "@timestamp" : ISODate("2019-01-01T13:10:00.158Z")
    }
}

Index:

{
    "v" : 2,
    "key" : {
        "_id" : 1
    },
    "name" : "_id_",
    "ns" : "test_database.map_reduce_test_data"
},
{
    "v" : 2,
    "key" : {
        "properties.@timestamp" : -1,
        "geometry" : "2dsphere"
    },
    "name" : "properties.@timestamp_-1_geometry_2dsphere",
    "ns" : "test_database.map_reduce_test_data",
    "2dsphereIndexVersion" : 3
}

Query:

db.map_reduce_test_data.find({"properties.@timestamp": { $gt: new Date("2019-01-01T12:00:00Z"), $lt: new Date("2019-01-01T13:00:00Z") }})

Explain result:

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test_database.map_reduce_test_data",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "properties.@timestamp" : {
                        "$lt" : ISODate("2019-01-01T13:00:00Z")
                    }
                },
                {
                    "properties.@timestamp" : {
                        "$gt" : ISODate("2019-01-01T12:00:00Z")
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "COLLSCAN",
            "filter" : {
                "$and" : [
                    {
                        "properties.@timestamp" : {
                            "$lt" : ISODate("2019-01-01T13:00:00Z")
                        }
                    },
                    {
                        "properties.@timestamp" : {
                            "$gt" : ISODate("2019-01-01T12:00:00Z")
                        }
                    }
                ]
            },
            "direction" : "forward"
        },
        "rejectedPlans" : [ ]
    },
    "ok" : 1
}
user7641438
  • 85
  • 2
  • 7
  • 1
    Can you share a representative document, the query, and the `createIndex` command you used to set up the compound index? – Buzz Moschetti Jan 28 '20 at 08:17
  • I update the post to include the information requested. – user7641438 Jan 28 '20 at 10:44
  • 1
    Nice -- but could you also post the exact `find` or `aggregate` expression you used that provide the `COLLSCAN`? – Buzz Moschetti Jan 28 '20 at 14:52
  • Hi, update the question. Thanks for the replay. – user7641438 Jan 28 '20 at 15:47
  • 1
    Good news: I reproduced the issue. Bad news: Cannot explain why a compound index `{fld1:1, fld2:2dsphere` is not being used for query `fld`. – Buzz Moschetti Jan 31 '20 at 07:56
  • I tried to force it to use the compound index (by using .hint()), however the execution time is much longer compared to COLLSCAN. Do you think it has something to do with the underlying data-structure used by "regular" indices vs geospatial? As far as I understand "regular" indices uses B+-trees and geospatial uses R-trees, can these indices be combined? – user7641438 Jan 31 '20 at 08:04
  • I tried to change the query to use equality for Date, i.e. {"timestamp": new Date(....)} and it used the [timestamp, 2dsphere] index, so maybe it's inefficient to use range with 2dsphere index? Due to the underlying data-structures being used? – user7641438 Jan 31 '20 at 09:12
  • 1
    How many docs in the collection? I will rejigger my test to query a big collection (100,000 docs) instead of the 3 samples. – Buzz Moschetti Jan 31 '20 at 10:06
  • Ca 100 million. – user7641438 Jan 31 '20 at 10:08
  • 1
    I was using date equality all along -- and it did NOT use the `ts,2dsphere` index. Peculiar behavior. – Buzz Moschetti Jan 31 '20 at 10:47

0 Answers0