I have a web crawler application that stores web pages (as HTML) in MongoDB, and I want to run a full-text search over this database to find the desired pages for a given query. I'm using the MongoDB Java driver to run the full-text query with this code:
BasicDBObject query = new BasicDBObject();
query.put("text", collectionName);
query.put("search", queryText);
query.put("limit", limit);
CommandResult queryResult = db.command(query);
The problem is that when I set "limit" too high (for example, 200), queryResult always comes back with 0 results. If I lower "limit" to 130, queryResult returns results as expected. My guess is that the problem lies in the maximum BSON document size (16 MB), so the command fails when the combined result document is too large.
Here are my collection stats:
> db.web.stats();
{
    "ns" : "web-crawler.web",
    "count" : 12129,
    "size" : 1622270432,
    "avgObjSize" : 133751,
    "storageSize" : 1952681984,
    "numExtents" : 18,
    "nindexes" : 2,
    "lastExtentSize" : 511258624,
    "paddingFactor" : 1,
    "systemFlags" : 0,
    "userFlags" : 1,
    "totalIndexSize" : 566776672,
    "indexSizes" : {
        "_id_" : 400624,
        "content_ftindex" : 566376048
    },
    "ok" : 1
}
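To sanity-check the 16 MB guess against these stats, here is a rough back-of-the-envelope sketch. It assumes every matched page is about "avgObjSize" bytes, which real search results need not be, and the class and method names (ResultSizeEstimate, estimatedBytes, maxAverageDocs) are made up for illustration only:

```java
// Rough estimate only: assumes every matched page is about avgObjSize bytes,
// which the pages returned by a given search need not be.
final class ResultSizeEstimate {
    // Maximum BSON document size in MongoDB: 16 MB.
    static final long MAX_BSON_BYTES = 16L * 1024 * 1024;

    // Estimated size in bytes of a reply holding `limit` average-sized documents.
    static long estimatedBytes(long avgObjSize, int limit) {
        return avgObjSize * limit;
    }

    // Largest number of average-sized documents that still fits under the limit.
    static long maxAverageDocs(long avgObjSize) {
        return MAX_BSON_BYTES / avgObjSize;
    }

    public static void main(String[] args) {
        long avgObjSize = 133751L; // "avgObjSize" from db.web.stats()
        System.out.println("max average-sized docs: " + maxAverageDocs(avgObjSize));
        for (int limit : new int[] {130, 200}) {
            long bytes = estimatedBytes(avgObjSize, limit);
            System.out.println("limit " + limit + ": ~" + bytes
                    + " bytes, over 16 MB: " + (bytes > MAX_BSON_BYTES));
        }
    }
}
```

By this estimate only about 125 average-sized pages fit in 16 MB. A limit of 200 is well over that, while 130 is only slightly over, so the pages matched by my test query are probably a bit smaller than average. That seems roughly consistent with my guess, but it is not proof.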
Any idea how to solve this problem?