I created a collection in MongoDB as shown below:

db.articles.insert([
 { _id: 1, subject: "one", author: "abc", views: 50 },
 { _id: 2, subject: "lastone", author: "abc", views: 5 },
 { _id: 3, subject: "firstone", author: "abc", views: 90  },
 { _id: 4, subject: "everyone", author: "abc", views: 100 },
 { _id: 5, subject: "allone", author: "efg", views: 100 },
 { _id: 6, subject: "noone", author: "efg", views: 100 },
 { _id: 7, subject: "nothing", author: "abc", views: 100 }])

After that, I created a text index on the subject and author fields.

db.articles.createIndex(
    {subject: "text",
    author: "text"})

Now I am trying to search for the word "one" in the indexed fields. When I execute the query ...

db.articles.count({$text: {$search: "\"one\""}})

... the result is 1.

The problem is that when I want combination of words "one", "abc" ...

db.articles.count({$text: {$search: "\"one\" \"abc\""}})

... it gives the result as 4, including the records whose subject is "lastone", "firstone", "everyone" and "one".

So my question is: why doesn't the first query fetch 4 records? And how can I write a query that fetches the 4 records containing the word "one"?

glytching
Sameesh

1 Answer

This command ...

db.articles.count({$text: {$search: "\"one\""}})

... will count the documents having the exact phrase "one". There is only one such document, hence the result is 1.

Querying with the value "one" should return only one document, since only one document contains either "one" or a word whose stem is "one". From the docs:

For case insensitive and diacritic insensitive text searches, the $text operator matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or blueberries will match.

Looking at the documents in your question ...

  • one is not a stem of everyone
  • one is not a stem of lastone
  • one is not a stem of allone
  • one is not a stem of firstone
  • one is not a stem of noone

... so none of these documents will be matched for the value one.
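The difference between whole-word matching and substring matching can be illustrated with a rough sketch in plain JavaScript. This is not how MongoDB implements $text (it ignores stemming, stop words and scoring), and the helper name matchesWholeWord is hypothetical; it only shows why a term has to equal a complete token rather than appear somewhere inside it.

```javascript
// Subjects of the sample documents from the question.
const subjects = ["one", "lastone", "firstone", "everyone", "allone", "noone", "nothing"];

// Hypothetical helper: split a field into lowercase word tokens and
// compare each complete token with the search term (no stemming applied).
function matchesWholeWord(text, term) {
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean);
  return tokens.includes(term.toLowerCase());
}

// Whole-word comparison: only the subject that is exactly "one" qualifies.
const wholeWordMatches = subjects.filter(s => matchesWholeWord(s, "one"));

// Substring comparison: every subject that contains "one" anywhere qualifies.
const substringMatches = subjects.filter(s => s.includes("one"));

console.log(wholeWordMatches);        // [ 'one' ]
console.log(substringMatches.length); // 6
```

The whole-word count of 1 lines up with what count({$text: {$search: "one"}}) returns for these documents, while the substring count is what a partial string search would find.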

You can, of course, query with multiple values. For example:

  • The docs suggest that this should be evaluated as one or abc and it correctly returns 5:

    db.articles.count({$text: {$search: "one abc"}})
    
  • The docs suggest that this should be evaluated as "abc" AND ("abc" or "one") and it correctly returns 5:

    db.articles.count({$text: {$search: "\"abc\" one"}})
    
  • The docs suggest that this should be evaluated as "one" AND ("one" or "abc") but it somehow returns 4:

    db.articles.count({$text: {$search: "\"one\" abc"}})
    

In the last example MongoDB includes the documents whose subject is "one", "lastone", "firstone" or "everyone" but excludes the document whose subject is "nothing". This suggests that it has somehow deemed "one" to be a stem of "lastone", "firstone" and "everyone". Yet count({$text: {$search: "one"}}) returns 1, which clearly indicates that "one" is not seen as a stem of "lastone", "firstone" and "everyone".

I suspect this might be a bug and might be worth raising with MongoDB.

FWIW, it's possible that what you actually want is a partial string search, in which case $regex might work. The following query ...

db.articles.count({ subject: { $regex: /one$/ }, author: { $regex: /abc$/ } })

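... means something like count where subject like '%one' and author like '%abc' (the trailing $ anchors each pattern to the end of the string). For your documents it returns 4, i.e. the documents where subject is "one", "lastone", "firstone" or "everyone" and author is "abc". The subjects "allone" and "noone" also end in "one", but their author is "efg", so those documents are excluded.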
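As a sanity check that does not need a running MongoDB instance, the same two patterns can be applied to the sample documents in plain JavaScript. This is only a sketch: it emulates the $regex filter with Array.prototype.filter and RegExp.prototype.test, and the variable names are made up for the illustration.

```javascript
// The sample documents from the question.
const articles = [
  { _id: 1, subject: "one", author: "abc", views: 50 },
  { _id: 2, subject: "lastone", author: "abc", views: 5 },
  { _id: 3, subject: "firstone", author: "abc", views: 90 },
  { _id: 4, subject: "everyone", author: "abc", views: 100 },
  { _id: 5, subject: "allone", author: "efg", views: 100 },
  { _id: 6, subject: "noone", author: "efg", views: 100 },
  { _id: 7, subject: "nothing", author: "abc", views: 100 }
];

// Emulate { subject: { $regex: /one$/ }, author: { $regex: /abc$/ } }:
// both patterns must match, and $ anchors each one to the end of the value.
const matched = articles.filter(
  doc => /one$/.test(doc.subject) && /abc$/.test(doc.author)
);

console.log(matched.length);                  // 4
console.log(matched.map(doc => doc.subject)); // [ 'one', 'lastone', 'firstone', 'everyone' ]
```

Dropping the $ anchor (for example /one/) would match "one" anywhere in the subject; for these particular documents that selects the same four, since every subject containing "one" also ends with it.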

glytching
  • I don't get that. `db.articles.count({author: "abc"})` returns 5 not 3 documents. Why you said that there are three documents with "abc" ? – mickl Apr 17 '18 at 20:47
  • @glytching I also find result of db.articles.count({author: "abc"}) as 5 as mickl told – Sameesh Apr 18 '18 at 05:14
  • `5` is the correct count for 'documents containing abc'. I have updated the answer. This does not change the explanation for why you are not matching more than one document containing "one". – glytching Apr 18 '18 at 06:54
  • @glytching my question is why it gives 4 when I search with the keyword "\"one\" \"abc\""? If "one" alone gives only 1 result, then this case should return 1 rather than 4, right? – Sameesh Apr 18 '18 at 08:45
  • @Sameesh agreed, that is confusing. I have updated my answer in an attempt to address this. – glytching Apr 18 '18 at 09:15
  • @glytching when I execute db.articles.find({$text: {$search: "\"abc\"\"o\""}}) it gives 5 records, so it effectively performs partial searching. But db.articles.find({$text: {$search: "\"o\""}}) won't give any records – Sameesh Apr 18 '18 at 14:45
  • This: `$search: "\"o\""` is an exact match on the term "o", there are no docs matching that. To be specific: "o" does not match "one". For more details read the docs on [phrases](https://docs.mongodb.com/manual/reference/operator/query/text/#phrases) and on [stemmed words](https://docs.mongodb.com/manual/reference/operator/query/text/#stemmed-words). – glytching Apr 18 '18 at 14:49