How do I remove all but the first n objects from a collection in MongoDB? For example, I only want to keep the first 2000 objects in my collection, but at the moment, there are 15000 objects.

EDIT: My question is more generalized than this related question. Not a duplicate.

SC_Chupacabra
  • Related question [How to delete N numbers of documents in mongodb](http://stackoverflow.com/questions/19065615/how-to-delete-n-numbers-of-documents-in-mongodb). – chridam Nov 03 '15 at 14:33

2 Answers

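Have you considered a capped collection with the max parameter? https://docs.mongodb.org/manual/core/capped-collections/

Note that a capped collection also requires a size (maximum size in bytes) alongside max; the size value here is only illustrative:

db.createCollection("log", { capped : true, size : 5242880, max : 2000 } );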

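If you really are just looking to delete all but the 2000 newest objects, you could find the _id of the 2000th-newest document and remove everything whose _id is $lt that value. This relies on default ObjectId values for _id, which sort roughly by creation time.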
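The cutoff idea above can be sketched as a small shell helper. This is only a sketch under assumptions: `keepNewest` and its `coll` parameter are hypothetical names, it is meant for mongosh (where cursors are synchronous), `n` is assumed to be a positive integer, and `_id` is assumed to sort in insertion order.

```javascript
// Hypothetical helper: keep the n newest documents, delete the rest.
// Assumes _id sorts in insertion order (roughly true for default ObjectIds).
function keepNewest(coll, n) {
  // Sort newest first and pick the n-th document as the cutoff.
  var boundary = coll.find({}, { _id: 1 })
    .sort({ _id: -1 })
    .skip(n - 1)
    .limit(1)
    .toArray();
  if (boundary.length === 0) {
    return 0; // fewer than n documents exist, so nothing is deleted
  }
  // Delete everything strictly older than the cutoff document.
  var result = coll.deleteMany({ _id: { $lt: boundary[0]._id } });
  return result.deletedCount;
}
```

Called as `keepNewest(db.log, 2000)`, it would return the number of documents removed. Because it compares against a single boundary value, it avoids sending a list of 2000 ids to the server.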

jpaljasma
  • At that point he would have something not fully manageable. Notice that you cannot delete documents from a capped collection! That would be a fixed size collection. – Eleanore Nov 03 '15 at 14:38
  • And you are right. However, user asked: `I only want to keep the first 2000 objects in my collection` - hence capped collection. – jpaljasma Nov 03 '15 at 14:40
  • Yes, I agree. Your solution works fine if that collection is then (allow me the term) "dead", i.e., he won't modify its size. If he wants to do something more after the shrinking, then another solution may be needed. – Eleanore Nov 03 '15 at 14:43
  • In this case, Eleanore is right. The collection isn't set-in-stone so to speak, but I appreciate the extra suggestion (and I learned something in the process). – SC_Chupacabra Nov 03 '15 at 14:59
  • This is exactly what I was looking for here, I just didn't know the term. Thank you! – Robin Métral May 30 '19 at 17:31

You could select the IDs of the first N documents (that you want to keep):

var ids = db.collection.find({}, {_id: 1}).limit(N).toArray().map(function(doc){
    // Collect only the _id of each document to keep.
    return doc._id;
});

Then, you perform the following query:

db.collection.remove({_id:{$nin:ids}})

This removes every document whose _id is NOT in the array ids. For further information about the $nin (i.e., "not in") operator, see the MongoDB documentation.
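The two steps can also be wrapped into one helper with an explicit sort, since without one "first N" means whatever order the server happens to return. This is a sketch under assumptions: `keepFirst` and `coll` are hypothetical names, it targets mongosh, it orders by `_id` ascending (an assumption about what "first" means), and it uses `deleteMany` in place of `remove`.

```javascript
// Hypothetical helper: keep the first n documents by ascending _id, delete the rest.
function keepFirst(coll, n) {
  // limit(0) means "no limit" in MongoDB, so reject anything below 1.
  if (!Number.isInteger(n) || n < 1) {
    throw new Error("n must be a positive integer");
  }
  // Step 1: collect the _id values of the documents to keep.
  var ids = coll.find({}, { _id: 1 })
    .sort({ _id: 1 })
    .limit(n)
    .toArray()
    .map(function (doc) { return doc._id; });
  // Step 2: delete every document whose _id is not in that list.
  return coll.deleteMany({ _id: { $nin: ids } }).deletedCount;
}
```

For the question's numbers, `keepFirst(db.collection, 2000)` would be expected to remove the other 13000 documents, provided nothing is inserted between the two steps.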

Eleanore