
I am using GridFS to save millions of files. When getting a list of all files, the large result set causes Mongo to fail. In Python I handle this by calling find() with an empty filter:

import gridfs
from pymongo import MongoClient

client = MongoClient("192.168.0.13")
db = client.test
fs = gridfs.GridFS(db)
for f in fs.find():
    # ..relevant python code

This approach works because .find() gives me a cursor.

With Scala and Casbah, I could not find a way to do this. No matter what I try, Mongo performs some operation on the whole result set and exceeds the memory limit assigned to that operation. My Scala test code is:

val mongoClient = MongoClient("192.168.0.13")
val db = mongoClient("test")

val gridfs = GridFS(db)
for(f <- gridfs) println(f.filename)

Running this code leads to:

Exception in thread "main" com.mongodb.MongoException: Runner error: Overflow sort stage buffered data usage of 33554552 bytes exceeds internal limit of 33554432 bytes

I just could not manage to obtain a cursor from Casbah for GridFS access. How do I do it?
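The error message suggests an in-memory sort stage, so one thing I have tried as a workaround (a sketch only, not verified as the intended Casbah API for this) is to skip the GridFS iterator entirely and query the underlying fs.files metadata collection directly, which returns a plain MongoCursor that fetches documents in batches without sorting:

```scala
import com.mongodb.casbah.Imports._

// Sketch of a possible workaround: read GridFS metadata straight from
// the fs.files collection, so no sort is applied to the result set.
val mongoClient = MongoClient("192.168.0.13")
val db = mongoClient("test")

// find() with no arguments returns a MongoCursor over all file documents
val filesCursor = db("fs.files").find()
for (doc <- filesCursor)
  println(doc.getAs[String]("filename").getOrElse(""))
```

This only yields the metadata documents, not GridFSDBFile handles, so it would still need a separate lookup to read a file's contents.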

mahonya
