58
for post in db.datasets.find({"test_set":"abc"}).sort("abc",pymongo.DESCENDING).skip((page-1)*num).limit(num):

How do I get the count()?

vvvvv
TIMEX

8 Answers

58

As of pymongo 3.7.0, count() is deprecated; use Collection.count_documents instead. Calling cursor.count or collection.count produces the following warning:

DeprecationWarning: count is deprecated. Use Collection.count_documents instead.

To use count_documents, the code can be adjusted as follows:

import pymongo

db = pymongo.MongoClient()
col = db[DATABASE][COLLECTION]

find = {"test_set":"abc"}
sort = [("abc",pymongo.DESCENDING)]
skip = 10
limit = 10

doc_count = col.count_documents(find, skip=skip)
results = col.find(find).sort(sort).skip(skip).limit(limit)

for doc in results:
    pass  # Process document

Note: count_documents is relatively slow compared to the old count method. If you need the total number of documents in the collection rather than the number matching a filter, collection.estimated_document_count is faster. As the name suggests, it returns an estimate based on collection metadata, and it does not accept a filter.
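One way to act on this note, as a minimal sketch: use the metadata-based estimate only when there is no filter, and fall back to count_documents otherwise. The helper name count_matching is hypothetical (it is not part of pymongo), and collection is assumed to be a pymongo Collection from version 3.7 or later.

```python
def count_matching(collection, query=None):
    """Return the number of documents matching `query`.

    Hypothetical helper, not part of pymongo. `collection` is assumed to be
    a pymongo Collection (3.7 or later).
    """
    if not query:
        # No filter: use the fast estimate from collection metadata.
        # The result may be approximate.
        return collection.estimated_document_count()
    # With a filter, ask the server for an exact count of the matches.
    return collection.count_documents(query)
```

With the names from the answer above, this would be called as count_matching(col, find).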

Martin M.
Sohaib Farooqi
  • 21
    This is an annoying change. Now, instead of making find request, and then `.count`ing the result Cursor, you need to make *two* calls, one to run count_documents, and one to get the actual results. This seems wasteful with complex filters (I don't know if it actually is). For small results sets `len(list(cursor))` could work, but that's going to be a terrible idea on a large result set... – naught101 Nov 02 '20 at 06:39
  • 5
    For copy-pasters: using the variable name `filter` shadows Python's built-in `filter` function, which could result in unexpected behaviour in your code – toing_toing Jan 24 '21 at 00:08
51

If you're using pymongo version 3.7.0 or higher, see this answer instead.


If you want results_count to ignore your limit():

results = db.datasets.find({"test_set":"abc"}).sort("abc",pymongo.DESCENDING).skip((page-1)*num).limit(num)
results_count = results.count()

for post in results:

If you want results_count to be capped at your limit(), pass True as the first argument (with_limit_and_skip in pymongo; applySkipLimit in the mongo shell):

results = db.datasets.find({"test_set":"abc"}).sort("abc",pymongo.DESCENDING).skip((page-1)*num).limit(num)
results_count = results.count(True)

for post in results:
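For pagination, the uncapped count is typically what feeds the page arithmetic. A minimal sketch, assuming page is 1-based as in the question; the function name page_info is hypothetical:

```python
import math


def page_info(total_count, page, num):
    """Return (skip, total_pages) for a 1-based `page` with `num` items per page."""
    if num <= 0:
        raise ValueError("num must be positive")
    skip = (page - 1) * num                     # same offset as in the query above
    total_pages = math.ceil(total_count / num)  # last page may be partially filled
    return skip, total_pages
```

For example, page_info(results_count, page, num) gives the skip offset for the current page and the number of pages to show in a pager.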
thirtydot
  • @Jake: What do you suggest instead? – thirtydot Nov 07 '13 at 11:01
  • 1
    results_count = results.count(True) http://docs.mongodb.org/manual/reference/method/cursor.count/ I think I misread your post though. I stopped at the first for loop. I see now that you have that mentioned. Sorry for flying by the seat of my pants. – Jake Nov 08 '13 at 07:20
9

Not sure why you want the count if you are already passing a limit of num. Anyway, if you want to assert it, here is what you should do:

results = db.datasets.find({"test_set":"abc"}).sort("abc",pymongo.DESCENDING).skip((page-1)*num).limit(num)

results_count = results.count(True)

That makes results_count match num (or fewer, if fewer than num documents remain after the skip).
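As far as I understand it, a count that honours skip and limit is whatever remains after skipping, capped at the limit, so it equals num only on full pages. A sketch of that arithmetic (the function name is hypothetical, and a limit of 0 is treated as "no limit", as MongoDB does):

```python
def capped_count(total_matches, skip, limit):
    """Number of documents a query returns after applying skip and limit."""
    remaining = max(0, total_matches - skip)  # nothing left if skip overshoots
    if limit <= 0:
        return remaining                      # limit of 0 means "no limit"
    return min(remaining, limit)
```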

Nanda Kishore
4

Unfortunately I cannot comment on @Sohaib Farooqi's answer. A quick note: although cursor.count() has been deprecated, it was significantly faster than collection.count_documents() in all of my tests when counting all documents in a collection (i.e. filter={}). Running db.currentOp() reveals that collection.count_documents() uses an aggregation pipeline, while cursor.count() doesn't. This might be the cause.

Max
2

If you want the count of all records in a collection (without any filter), use this:

from pymongo import MongoClient
cl = MongoClient(host="localhost", port=27017)
db = cl["database_name"]
print(db.get_collection("collection_name").estimated_document_count())
1

This thread happens to be 11 years old. However, as of 2022 the 'count()' function has been deprecated. Here is a way I came up with to count documents in MongoDB using Python. Here is a picture of the code snippet. Making an empty list is not needed; I just wanted to be outlandish. Hope this helps :). Code snippet here.

1

If you want to use cursor and also want count, you can try this way

from pymongo import MongoClient

# Have 27 items in collection
db = MongoClient(_URI)[DB_NAME][COLLECTION_NAME]

cursor = db.find()
# explain() works on a clone, so the cursor itself is not consumed
count = cursor.explain().get("executionStats", {}).get("nReturned")
# Output: 27

cursor = db.find().limit(5)
count = cursor.explain().get("executionStats", {}).get("nReturned")
# Output: 5

# Can also use cursor
for item in cursor:
      ...

You can read more about it from https://pymongo.readthedocs.io/en/stable/api/pymongo/cursor.html#pymongo.cursor.Cursor.explain

rish_hyun
0

In my case I need the count of matched elements for a given query, and I certainly don't want to run the query twice:

once to get the count, and
once to get the result set.

I know the result set is not very big and fits in memory, so I can convert it to a list and take the list's length.

This code illustrates the use case:

# pymongo 3.9.0
while not is_over:
  it = items.find({"some": "/value/"}).skip(offset).limit(limit)
  # list() loads the cursor content into memory
  it = list(it)
  # A page shorter than the limit means there is nothing left to fetch
  if len(it) < limit:
    is_over = True
  offset += limit
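The same loop can be wrapped in a generator so that the paging logic is separate from the query. This is only a sketch: iter_pages and fetch_page are hypothetical names, and fetch_page(offset, limit) is assumed to return a list, for example lambda offset, limit: list(items.find(query).skip(offset).limit(limit)).

```python
def iter_pages(fetch_page, limit):
    """Yield lists of up to `limit` items until a short or empty page is seen.

    `fetch_page(offset, limit)` is a hypothetical callable that returns
    one page of results as a list.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if page:
            yield page
        # A page shorter than the limit means the result set is exhausted.
        if len(page) < limit:
            break
        offset += limit
```

Each yielded page is already a list, so len(page) gives the count for that page without a second query.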
Evhz