I have a web crawler application that stores web pages (as HTML) in MongoDB, and I want to run a full-text search against this database to retrieve the desired web pages for a query. I'm using the MongoDB Java driver to run the full-text query with this code:

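import com.mongodb.BasicDBObject;
import com.mongodb.CommandResult;

// Build the "text" command: { text: <collection>, search: <query>, limit: <n> }
BasicDBObject query = new BasicDBObject();
query.put("text", collectionName);
query.put("search", queryText);
query.put("limit", limit);
// Run it as a database command
CommandResult queryResult = db.command(query);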

The problem is that when I set "limit" too high (for example, 200), queryResult always returns 0. If I change "limit" to 130, queryResult returns the expected results. My guess is that the problem lies in the maximum BSON document size (16 MB), so the query fails if the result document is too large.

Here are my collection stats:

> db.web.stats();
{
        "ns" : "web-crawler.web",
        "count" : 12129,
        "size" : 1622270432,
        "avgObjSize" : 133751,
        "storageSize" : 1952681984,
        "numExtents" : 18,
        "nindexes" : 2,
        "lastExtentSize" : 511258624,
        "paddingFactor" : 1,
        "systemFlags" : 0,
        "userFlags" : 1,
        "totalIndexSize" : 566776672,
        "indexSizes" : {
                "_id_" : 400624,
                "content_ftindex" : 566376048
        },
        "ok" : 1
}
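
As a rough check of the 16 MB guess, the "avgObjSize" value above can be multiplied by the limit. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and it assumes every matching page is about the average size and that all matches come back in one reply document. Pages that actually match a given query may be smaller or larger than average.

```java
public class BsonLimitEstimate {
    // Maximum BSON document size in MongoDB: 16 MiB.
    static final long MAX_BSON_BYTES = 16L * 1024 * 1024;

    /** Estimated payload if `limit` average-sized documents are packed into one reply. */
    static long estimatedReplyBytes(long avgObjSize, int limit) {
        return avgObjSize * limit;
    }

    /** Largest number of average-sized documents that stays under the cap. */
    static long maxDocsUnderCap(long avgObjSize) {
        return MAX_BSON_BYTES / avgObjSize;
    }

    public static void main(String[] args) {
        long avgObjSize = 133751L; // "avgObjSize" from db.web.stats()
        System.out.println("limit=200 -> " + estimatedReplyBytes(avgObjSize, 200) + " bytes");
        System.out.println("limit=130 -> " + estimatedReplyBytes(avgObjSize, 130) + " bytes");
        System.out.println("cap       -> " + MAX_BSON_BYTES + " bytes");
        System.out.println("docs that fit at average size: " + maxDocsUnderCap(avgObjSize));
    }
}
```

At the average size, a limit of 200 works out to about 26.7 million bytes, well over the cap, and only about 125 documents fit. That is in the same range as the observed threshold, which is at least consistent with the size-limit guess.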

Any idea how to solve this problem?

  • That is not how you run a query from Java. Use `DBCollection.find()` – Martin Feb 12 '15 at 12:13
  • Back up a step and [familiarize yourself with the Java driver](http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-java-driver/). – wdberkeley Feb 12 '15 at 15:40
  • Honestly, I'm following this [link](http://stackoverflow.com/questions/15879109/how-to-execute-full-text-search-command-in-mongodb-with-java-driver) to execute the full-text query with the MongoDB Java driver. I have tried the DBCollection.find() method too. The difference is that command() returns results sorted by score, but only the top 100 by default, so I don't need to sort the results again. That's why I'm using command() instead of find(). – Salman El Farisi Feb 13 '15 at 03:59
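
The find()-based approach from the comments can still return results ordered by score. A minimal sketch of the query documents involved, assuming a server that supports the `$text` query operator (MongoDB 2.6 or later): plain `java.util.Map` objects are used here so the snippet runs without the driver, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TextSearchSpec {
    /** Filter document: { "$text": { "$search": queryText } } */
    static Map<String, Object> filter(String queryText) {
        Map<String, Object> search = new LinkedHashMap<>();
        search.put("$search", queryText);
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("$text", search);
        return filter;
    }

    /**
     * Document { "score": { "$meta": "textScore" } }.
     * The same shape serves as the projection (to expose the score)
     * and as the sort specification (to order by relevance).
     */
    static Map<String, Object> scoreMeta() {
        Map<String, Object> meta = new LinkedHashMap<>();
        meta.put("$meta", "textScore");
        Map<String, Object> score = new LinkedHashMap<>();
        score.put("score", meta);
        return score;
    }

    public static void main(String[] args) {
        // "web crawler" is only a sample search string.
        System.out.println(filter("web crawler"));
        System.out.println(scoreMeta());
    }
}
```

With the legacy driver, these maps could be wrapped as `new BasicDBObject(map)` and passed along the lines of `collection.find(filter, scoreMeta).sort(scoreMeta).limit(limit)`. Because find() returns a cursor that fetches documents in batches, the whole result set does not have to fit into a single reply document.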
