
I have a test collection with these two documents:

{ _id: ObjectId("636ce11889a00c51cac27779"), sku: 'kw-lids-0009' }
{ _id: ObjectId("636ce14b89a00c51cac2777a"), sku: 'kw-fs66-gre' }

I've created a search index with this definition:

{
  "analyzer": "lucene.standard",
  "searchAnalyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "sku": {
        "type": "string"
      }
    }
  }
}

If I run this aggregation:

[{
    $search: {
        index: 'test',
        text: {
            query: 'kw-fs',
            path: 'sku'
        }
    }
}]

Why do I get 2 results? I only expected the one with sku: 'kw-fs66-gre'

Stefano Sala

2 Answers


You're getting 2 results because of tokenization: the query still matches the [kw] token in both documents. If you search for "fs66", you'll get a single match only. Results are scored by relevance, not filtered. You can add {$project: {score: { $meta: "searchScore" }}} to your pipeline to see the difference in score between the matching documents.
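As a minimal sketch of that suggestion, the pipeline below keeps the $search stage from the question and adds a $project stage that exposes the score. Projecting sku alongside the score, and the collection name `test` in the usage comment, are assumptions made for readability.

```javascript
// Sketch only: the $search stage from the question, followed by a
// $project stage that surfaces each hit's relevance score.
const pipeline = [
  {
    $search: {
      index: 'test',
      text: { query: 'kw-fs', path: 'sku' }
    }
  },
  {
    $project: {
      sku: 1,                           // assumption: keep sku to tell the hits apart
      score: { $meta: 'searchScore' }   // relevance score assigned by $search
    }
  }
];

// In mongosh (the collection name `test` is an assumption):
// db.test.aggregate(pipeline)
```

Both documents should still come back; the point is only to compare their scores.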

If you want exact matches only, you can use the keyword analyzer or a custom analyzer that strips the dashes, so that you deal with a single token per field rather than 3.
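A possible shape for the keyword-analyzer variant is sketched below. It assumes the built-in `lucene.keyword` analyzer is set at field level for both indexing and searching; the variable name is hypothetical. With this definition the whole sku is one token, so a text query would need the full value ('kw-fs66-gre'), and 'kw-fs' would be expected to match nothing.

```javascript
// Sketch only: index definition that treats each sku as a single token.
// Keys are quoted so the object can also be pasted as JSON.
const keywordIndexDefinition = {
  "mappings": {
    "dynamic": false,
    "fields": {
      "sku": {
        "type": "string",
        "analyzer": "lucene.keyword",       // index the whole value as one token
        "searchAnalyzer": "lucene.keyword"  // analyze the query the same way
      }
    }
  }
};
```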

rkiesler

During indexing, the standard analyzer breaks the string "kw-lids-0009" into 3 tokens [kw][lids][0009], and similarly tokenizes "kw-fs66-gre" as [kw][fs66][gre]. When you query for "kw-fs", the same analyzer tokenizes the query as [kw][fs], so Lucene matches both documents, as both have the [kw] token in the index.
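The effect can be reproduced outside Atlas with a rough simulation. The sketch below is not Lucene's tokenizer: it assumes that lowercasing and splitting on non-alphanumeric characters is a close enough stand-in for lucene.standard on these simple ASCII SKUs, and the function names are hypothetical.

```javascript
// Rough stand-in for lucene.standard on simple ASCII SKUs:
// lowercase, then split on anything that is not a letter or digit.
function approxStandardTokens(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Mimics the text operator's default behavior: a document matches
// when it shares at least one token with the query.
function matchingSkus(query, skus) {
  const queryTokens = approxStandardTokens(query);
  return skus.filter((sku) => {
    const docTokens = new Set(approxStandardTokens(sku));
    return queryTokens.some((token) => docTokens.has(token));
  });
}

const skus = ['kw-lids-0009', 'kw-fs66-gre'];

console.log(approxStandardTokens('kw-fs')); // tokens: kw, fs
console.log(matchingSkus('kw-fs', skus));   // both SKUs, via the shared kw token
console.log(matchingSkus('fs66', skus));    // only kw-fs66-gre
```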

To get the behavior you're looking for, you should index the sku field as type autocomplete and use the autocomplete operator in your $search stage instead of text.

Thanks Roy. I've changed the index definition to this:

```
{
  "analyzer": "lucene.standard",
  "searchAnalyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "sku": {
        "type": "autocomplete"
      }
    }
  }
}
```

And the query to this:

```
[{
    $search: {
        index: 'test1',
        autocomplete: {
            query: 'kw-fs66',
            path: 'sku'
        }
    }
}]
```

But still getting 2 results? – Stefano Sala Nov 10 '22 at 18:28