0

I have MongoDB collection with 50k+ documents like:

{
    "_id" : ObjectId("5a42190e806c3210acd3fa82"),
    "start_time" : ISODate("2017-12-25T02:29:00.000Z"),
    "system" : "GL",
    "region" : "NY",
    "order_id" : 3,
    "task_type" : "px2",
    "status" : false
}

When I add a new order, Python script searches database for existing orders with the same start_time and task_type, like:

    tasks_base.find_one({
    "$and": [{
        "start_time": plan_date.astimezone(pytz.UTC)
    }, {
        "task_type": px
    }]
    })

It works, but every new document in a collection slows it down (more documents to check, etc.).

As a solution, I want to add task_type and start_time as indexes for the collection. But have some concerns (date as index looks a bit unnatural). So, need an advice how to do it right (or additional ideas, how to speed up search). Grateful for any advice :)

sortas
  • 1,527
  • 3
  • 20
  • 29

1 Answers1

0

I solved it with 3 steps:

First, I created unique compound index:

tasks.create_index([("start_time", pymongo.ASCENDING,), ("task_proxy", pymongo.ASCENDING)], unique=True)

Then, I adjusted query to search only in index fields (covered query):

all_tasks = tasks.find({
        "$and": [{
            "start_time": {
                "$gte": plan_date.astimezone(pytz.UTC),
                "$lt": plan_date.astimezone(pytz.UTC) + timedelta(hours=1)
            }
        }, {
            "task_proxy": px
        }]
    }, {"_id": 0, "start_time": 1, "task_proxy": 1})

Last (same code above), I increased the size of time queries from 1 minute to 1 hour, so I can do 1 database operation instead of 60. I can operate with most of the data inside Python script, so load on the database is much lower :)

UPD: Rewrited 80% of code: I used 1 query for every order, now I use 1 query for every hour, find free time periods there, and pack orders in free cells (or move to another collection if not enough cells). Still using compound index and covered query, script exception time went from 15-17 seconds to 0.6.

sortas
  • 1,527
  • 3
  • 20
  • 29