
Definition

I'm building a search application that uses MongoDB to store the search data. Here is an example dataset from the "Resource" collection.

{
   _id:"5b3b84e02360a26f9a9ae96e",
   name:"Advanced Java",
   keywords:[
      "java", "thread", "state", "public", "void"
   ] 
},
{
   _id:"5b3b84e02360a26f9a9ae96f",
   name:"Java In Simple",
   keywords:[
      "java", "runnable", "thread", "sleep", "array"
   ]
}

Each document holds the name of a book and its most frequent words (in the keywords array). I'm using the Spring framework with MongoTemplate. If I run the code below,

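// requires: import static org.springframework.data.mongodb.core.query.Criteria.where;
MongoOperations mongoOperations = new MongoTemplate(new MongoClient("127.0.0.1", 27017), "ResourceDB");
// $in matches any document whose keywords array contains at least one of these words
Query query = new Query(where("keywords").in("java", "thread", "sleep"));
List<Resource> resources = mongoOperations.find(query, Resource.class);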

It returns both "Advanced Java" and "Java In Simple", which is correct.

Problem

But in my case, I need the results ordered by relevance. "Java In Simple" matches 3 words and "Advanced Java" matches only 2, so "Java In Simple" is the more relevant book and should come first.

Expecting Order

  • Java In Simple
  • Advanced Java

Is it possible to get the results ordered by number of matches? Or is there any way to get the number of matches for each item? For example, if I search for ("java", "thread", "sleep"), I'm expecting output like this:

  • Advanced Java - 2 matches
  • Java In Simple - 3 matches

Any help appreciated.

Nishan256

2 Answers


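$in does not count matches. A document qualifies as soon as one of its keywords appears in the list, so the query cannot tell 3 matches from 2. You need an aggregation pipeline that computes the intersection of the keywords array with the array from the query, then sorts by the size of the result: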

db.collection.aggregate([
    { $addFields: {
        matchedTags: { $size: { 
            $setIntersection: [ "$keywords", [ "java", "thread", "sleep" ] ] 
        } }
    } },
    { $match: { matchedTags: { $gt: 0 } } },
    { $sort: { matchedTags: -1 } }
])
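
To make the ranking logic concrete, here is a minimal in-memory Java sketch of what the pipeline computes. It does not talk to MongoDB; the class name KeywordRanking and the methods matchCount and rank are made up for illustration, and the data is the two example documents from the question. matchCount plays the role of $size over $setIntersection, and rank drops zero-match books and sorts by the count in descending order.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of Spring or the MongoDB driver.
public class KeywordRanking {

    // Number of distinct keywords shared with the query,
    // analogous to $size applied to $setIntersection.
    public static int matchCount(List<String> keywords, List<String> query) {
        Set<String> shared = new HashSet<>(keywords);
        shared.retainAll(new HashSet<>(query));
        return shared.size();
    }

    // Returns "name - count" lines, skipping books with no match
    // (like $match with $gt: 0) and sorting by count descending (like $sort: -1).
    public static List<String> rank(Map<String, List<String>> books, List<String> query) {
        List<Map.Entry<String, Integer>> scored = new ArrayList<>();
        for (Map.Entry<String, List<String>> book : books.entrySet()) {
            int count = matchCount(book.getValue(), query);
            if (count > 0) {
                scored.add(new AbstractMap.SimpleEntry<>(book.getKey(), count));
            }
        }
        scored.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : scored) {
            lines.add(entry.getKey() + " - " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Example data copied from the question.
        Map<String, List<String>> books = new LinkedHashMap<>();
        books.put("Advanced Java", Arrays.asList("java", "thread", "state", "public", "void"));
        books.put("Java In Simple", Arrays.asList("java", "runnable", "thread", "sleep", "array"));

        for (String line : rank(books, Arrays.asList("java", "thread", "sleep"))) {
            System.out.println(line);
        }
    }
}
```

Running main on the example data prints "Java In Simple - 3" followed by "Advanced Java - 2". For real data the aggregation pipeline is the better choice, since the counting and sorting then happen on the server instead of after loading every document.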
Alex Blex

This is for anyone who wants to run @Alex Blex's query from Java. It looks like MongoTemplate does not have an implementation for set intersection, so I did it using the MongoDB Java client.

List<String> keywords = Arrays.asList("java", "thread", "sleep");
BasicDBList intersectionList = new BasicDBList();
intersectionList.add("$keywords");
intersectionList.add(keywords);

AggregateIterable<Document> aggregate = new MongoClient("127.0.0.1", 27017).getDatabase("ResourceDB").getCollection("Resource").aggregate(
            Arrays.asList(
                    new BasicDBObject("$addFields",
                            new BasicDBObject("matchedTags",
                                    new BasicDBObject("$size",
                                            new BasicDBObject("$setIntersection", intersectionList)))),
                    new BasicDBObject("$match",
                            new BasicDBObject("matchedTags",
                                    new BasicDBObject("$gt", 0))),
                    new BasicDBObject("$sort",
                            new BasicDBObject("matchedTags", -1))
            )
    );
// Print each book with its match count, most relevant first
MongoCursor<Document> iterator = aggregate.iterator();
while (iterator.hasNext()) {
    Document document = iterator.next();
    System.out.println(document.get("name") + " - " + document.get("matchedTags"));
}
Nishan256