
I am trying to read a limited number of records after running a match query with find() on a MongoDB collection.

The query takes forever to process. On the next line, I try to read the elements of the returned cursor, and that also takes a long time to complete.

Please help me see where I am making a mistake, or where there is scope for improvement.

robot_data = collection.find({"robot_uid": {"$eq": "12345"}}, {"_id": 0}).sort("_id", 1).limit(500)

for a in robot_data:
    print(f"print elements : {a}")

Regards, Aarushi

MMA
  • What do your documents and collection look like? Do you have an index on the robot_uid field? What is the output of db.collection.getIndexes()? Can you provide the output of db.collection.find({"robot_uid": {"$eq":robot_uid["robot_uid"]}},{"_id":0}).sort("_id",1).limit(500).explain("executionStats")? – R2D2 Feb 08 '22 at 19:04
  • @R2D2 It is a JSON array document with 20 JSON objects. _id exists but is a GUID and does not include robot_uid. The Python equivalent of `getIndexes()`, `db.test.index_information()`, gives me an empty result: `{}`. For the explain command I get the error "explain() takes 1 positional argument but 2 were given". – MMA Feb 08 '22 at 19:36
  • @R2D2 Kindly also note that this piece of code takes a long time when executed inside a Python file (the latest run took longer than 30 minutes and I had to interrupt the process). However, it takes seconds when I use the Python shell directly. – MMA Feb 08 '22 at 19:38
  • I suspect `robot_uid` is not indexed, and you are doing a full document scan. I suggest creating an index on `robot_uid`. You are also omitting `_id` from the projection but still sorting by it, which seems a bit odd to me. Also, this is equivalent to your query: `{"robot_uid" : "12345" }`. You don't actually need `$eq`. – DavidA Feb 08 '22 at 20:56
  • Any difference in performance if you use `robot_data = collection.find({"robot_uid":"12345"}, sort=[("_id", pymongo.ASCENDING)], limit=500)`? – rickhg12hs Feb 08 '22 at 21:38
  • ... or `robot_data = collection.aggregate([{"$match": {"robot_uid": "12345"}}, {"$sort": {"_id": 1}}, {"$limit": 500}])`? [mongoplayground.net example](https://mongoplayground.net/p/XdLxoKekDjj "Click me!") – rickhg12hs Feb 09 '22 at 02:10
  • @rickhg12hs I tried all combinations but no improvement. I am trying to make robot_uid an index – MMA Feb 09 '22 at 05:27
  • "However, it takes seconds when I use the python shell directly" Are you sure the bottleneck is with `pymongo`? Weird that using a shell is so much faster? Is there anything else going on in the script file? – rickhg12hs Feb 09 '22 at 07:04
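
On the explain() error mentioned in the comments above: in pymongo, `Cursor.explain()` takes no arguments, so the shell-style `explain("executionStats")` call fails. One way to get comparable statistics is to run the `explain` command through `Database.command` with a `verbosity` argument. The following is only a minimal sketch under assumptions: the helper names `explain_find` and `summarize_execution_stats` are hypothetical, and the collection name `test` in the usage note is taken from the `db.test` mentioned in the comments.

```python
def explain_find(db, collection_name, query, limit=500):
    """Run the question's find() through the explain command.

    `db` is assumed to be a pymongo Database object. The projection,
    sort and limit mirror the query in the question.
    """
    find_command = {
        "find": collection_name,
        "filter": query,
        "projection": {"_id": 0},
        "sort": {"_id": 1},
        "limit": limit,
    }
    # Cursor.explain() accepts no verbosity argument, so the explain
    # command is issued directly with the desired verbosity.
    return db.command("explain", find_command, verbosity="executionStats")


def summarize_execution_stats(explain_result):
    """Pick out the counters that show whether an index was used."""
    stats = explain_result["executionStats"]
    return {
        "returned": stats["nReturned"],
        "docs_examined": stats["totalDocsExamined"],
        "keys_examined": stats["totalKeysExamined"],
        "millis": stats["executionTimeMillis"],
    }
```

With a connected database object, `summarize_execution_stats(explain_find(db, "test", {"robot_uid": "12345"}))` returns the four counters. A `docs_examined` value far above `returned`, with `keys_examined` at 0, is the usual sign of a collection scan.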

1 Answer


With everyone's help here, the fix that worked for me was to create an index on the search field robot_uid.

The search is extremely fast now.
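
For reference, a minimal sketch of that fix with pymongo. The connection URI, the database name `robots`, and the helper names are assumptions for illustration only; the collection name `test` comes from the `db.test` mentioned in the comments.

```python
def ensure_robot_uid_index(collection):
    """Create an ascending index on robot_uid and return its name.

    create_index does nothing if an identical index already exists.
    """
    # 1 means ascending (the same value as pymongo.ASCENDING).
    return collection.create_index([("robot_uid", 1)])


def fetch_robot_records(collection, robot_uid, limit=500):
    """Return up to `limit` documents for one robot, ordered by _id."""
    cursor = (
        collection.find({"robot_uid": robot_uid}, {"_id": 0})
        .sort("_id", 1)
        .limit(limit)
    )
    return list(cursor)


def main():
    # Imported here because only this function needs a live server.
    from pymongo import MongoClient

    # Assumed URI and database name; replace with your own deployment.
    client = MongoClient("mongodb://localhost:27017")
    collection = client["robots"]["test"]

    ensure_robot_uid_index(collection)
    for record in fetch_robot_records(collection, "12345"):
        print(f"print elements : {record}")
```

Calling `main()` against your own deployment creates the index once and then runs the query from the question. If the sort on `_id` is still slow, a compound index on `robot_uid` and `_id` may be worth trying, since it can serve both the match and the sort.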

Thank you! Aarushi

MMA