
Schema looks like this:

  "mappings": {
    "_doc": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "category_boost": {
          "type": "nested",
          "properties" : {
            "category": {
              "type": "text",
              "index": false
            },
            "boost": {
              "type": "integer",
              "index": false
            }
          }
        }
      }
    }
  }

The document in Elasticsearch does have data:

    "category_boost": [
        {
            "category": "A",
            "boost": 98
        },
        {
            "category": "B",
            "boost": 96
        },
        {
            "category": "C",
            "boost": 94
        }
    ],

Inside scoring function:

for (int i = 0; i < doc['category_boost.boost'].size(); ++i) {
    if (doc['category_boost.category'][i].value.equals(params.category)) {
        boost = doc['category_boost.boost'][i].value;
    }
}

I also tried length to get the size of the array, but that did not help. Since the loop does not affect the results, I tried dividing by size(), which throws a division-by-zero error, so I conclude the size is 0.

Overall problem: I have a category->boost map that is dynamic, so I cannot hardcode it into the schema. I tried the object type with a JSON object, but it turned out you cannot access those objects in scoring functions, so I went with arrays with defined types.

Pierre Mallet

1 Answer


The nested datatype creates sub-documents to represent the items of your collection. Accessing their doc values in a script is possible, but only from inside a nested query.

Here is one way of doing it, I hope it fulfills your requirements. This example only returns the document with a score depending on the chosen category.

NB: I used Elasticsearch 7 locally, so you will have to modify the mapping to add your "_doc" entry, etc.

Here is the modified mapping. I removed index: false from the nested properties, since we now use them in queries, and mapped category as keyword so the term query matches the exact value.

PUT test-score_nested
{
  "mappings": {
    "properties": {
      "category_boost": {
        "type": "nested",
        "properties": {
          "category": {
            "type": "keyword"
          },
          "boost": {
            "type": "integer"
          }
        }
      }
    }
  }
}

Then I add your sample data:

POST test-score_nested/_doc
{
  "category_boost": [
        {
            "category": "A",
            "boost": 98
        },
        {
            "category": "B",
            "boost": 96
        },
        {
            "category": "C",
            "boost": 94
        }
    ]
}

And then the query.

  1. We go one level deep in the nested collection
  2. Inside the collection we use a function score query with the replace mode
  3. Inside the function score, we use a filter query to "select" the right category and use its boost for the scoring

POST test-score_nested/_search
{
  "query": {
    "nested": {
      "path": "category_boost",
      "query": {
        "function_score": {
          "boost_mode": "replace", 
          "query": {
            "term": {
              "category_boost.category": {
                "value": "A"
              }
            }
          },
          "functions": [
            {
              "field_value_factor": {
                "field": "category_boost.boost"
              }
            }
          ]
        }
      }
    }
  }
}

This returns:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 98.0,
    "hits" : [
      {
        "_index" : "test-score_nested",
        "_type" : "_doc",
        "_id" : "v3Smqm0BZ7nyeX7PPevA",
        "_score" : 98.0,
        "_source" : {
          "category_boost" : [
            {
              "category" : "A",
              "boost" : 98
            },
            {
              "category" : "B",
              "boost" : 96
            },
            {
              "category" : "C",
              "boost" : 94
            }
          ]
        }
      }
    ]
  }
}
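
If a script is preferred, as in the question, the same idea can be sketched with script_score in place of field_value_factor. This is only a minimal sketch, assuming the Elasticsearch 7 mapping and the test-score_nested index shown above. Because the script runs inside the nested query, it sees one sub-document at a time, so doc['category_boost.boost'] should hold a single value and no loop over the array is needed.

POST test-score_nested/_search
{
  "query": {
    "nested": {
      "path": "category_boost",
      "query": {
        "function_score": {
          "boost_mode": "replace",
          "query": {
            "term": {
              "category_boost.category": {
                "value": "A"
              }
            }
          },
          "script_score": {
            "script": {
              "source": "doc['category_boost.boost'].value"
            }
          }
        }
      }
    }
  }
}

Only sub-documents matching the term query reach the script, so the category check from the original loop moves into the query itself.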

I hope it will help you!

Pierre Mallet
  • Is there a specific reason for boost_mode replace, or will it work with any boost mode? – Akash Salunkhe Jan 25 '21 at 13:58
  • @AkashSalunkhe it will work with any boost mode, but for other modes it will mix the initial nested query score with the function score. Feel free to experiment :) – Pierre Mallet Jan 26 '21 at 11:05
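
To illustrate the boost_mode discussion, here is a sketch of the answer's query with boost_mode set to sum instead of replace. The choice of sum is arbitrary and only for illustration; multiply, avg, max and min are the other documented modes. With sum, the score of each matching sub-document should be the term query's own relevance score plus the boost value, rather than the boost value alone.

POST test-score_nested/_search
{
  "query": {
    "nested": {
      "path": "category_boost",
      "query": {
        "function_score": {
          "boost_mode": "sum",
          "query": {
            "term": {
              "category_boost.category": {
                "value": "A"
              }
            }
          },
          "functions": [
            {
              "field_value_factor": {
                "field": "category_boost.boost"
              }
            }
          ]
        }
      }
    }
  }
}

Comparing _score between this request and the replace version on the same index is a quick way to see how much the term query contributes.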