I'm trying to decipher the output of the explain API in an Elasticsearch response, but I find it hard to follow. Are there any simple pointers or links that explain this JSON specifically? I understand TF, IDF and cosine similarity in the vector space model (VSM); what I'm missing is how the JSON maps onto them. Ideally, I'd like to see this JSON written out as a simple mathematical expression.

{
  "_explanation": {
    "value": 7.937373,
    "description": "sum of:",
    "details": [
      {
        "value": 2.4789724,
        "description": "weight(FirstName:M80806 in 35) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 2.4789724,
            "description": "score(doc=35,freq=1.0), product of:",
            "details": [
              {
                "value": 0.37350902,
                "description": "queryWeight, product of:",
                "details": [
                  {
                    "value": 6.6369815,
                    "description": "idf(docFreq=720, maxDocs=202323)"
                  },
                  {
                    "value": 0.056276944,
                    "description": "queryNorm"
                  }
                ]
              },
              {
                "value": 6.6369815,
                "description": "fieldWeight in 35, product of:",
                "details": [
                  {
                    "value": 1,
                    "description": "tf(freq=1.0), with freq of:",
                    "details": [
                      {
                        "value": 1,
                        "description": "termFreq=1.0"
                      }
                    ]
                  },
                  {
                    "value": 6.6369815,
                    "description": "idf(docFreq=720, maxDocs=202323)"
                  },
                  {
                    "value": 1,
                    "description": "fieldNorm(doc=35)"
                  }
                ]
              }
            ]
          }
        ]
      },
      {
        "value": 2.6825092,
        "description": "weight(FirstName:M8086 in 35) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 2.6825092,
            "description": "score(doc=35,freq=1.0), product of:",
            "details": [
              {
                "value": 0.38854012,
                "description": "queryWeight, product of:",
                "details": [
                  {
                    "value": 6.9040728,
                    "description": "idf(docFreq=551, maxDocs=202323)"
                  },
                  {
                    "value": 0.056276944,
                    "description": "queryNorm"
                  }
                ]
              },
              {
                "value": 6.9040728,
                "description": "fieldWeight in 35, product of:",
                "details": [
                  {
                    "value": 1,
                    "description": "tf(freq=1.0), with freq of:",
                    "details": [
                      {
                        "value": 1,
                        "description": "termFreq=1.0"
                      }
                    ]
                  },
                  {
                    "value": 6.9040728,
                    "description": "idf(docFreq=551, maxDocs=202323)"
                  },
                  {
                    "value": 1,
                    "description": "fieldNorm(doc=35)"
                  }
                ]
              }
            ]
          }
        ]
      },
      {
        "value": 2.7758915,
        "description": "weight(FirstName:MHMT in 35) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 2.7758915,
            "description": "score(doc=35,freq=1.0), product of:",
            "details": [
              {
                "value": 0.3952451,
                "description": "queryWeight, product of:",
                "details": [
                  {
                    "value": 7.0232153,
                    "description": "idf(docFreq=489, maxDocs=202323)"
                  },
                  {
                    "value": 0.056276944,
                    "description": "queryNorm"
                  }
                ]
              },
              {
                "value": 7.0232153,
                "description": "fieldWeight in 35, product of:",
                "details": [
                  {
                    "value": 1,
                    "description": "tf(freq=1.0), with freq of:",
                    "details": [
                      {
                        "value": 1,
                        "description": "termFreq=1.0"
                      }
                    ]
                  },
                  {
                    "value": 7.0232153,
                    "description": "idf(docFreq=489, maxDocs=202323)"
                  },
                  {
                    "value": 1,
                    "description": "fieldNorm(doc=35)"
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
user1189332
  • This will help you understand the scoring; if not, please let me know and I'll explain further: https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html – Arun Prakash Oct 23 '15 at 15:53
  • This should also help: http://www.lucenetutorial.com/advanced-topics/scoring.html – Val Oct 24 '15 at 03:23
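
For reference, the explain tree in the question can be read as nested arithmetic: each "product of:" node multiplies its children and each "sum of:" node adds them. A sketch of the expression, assuming the index uses Lucene's classic TF-IDF similarity (which the `idf(docFreq=..., maxDocs=...)` descriptions suggest) and treating `queryNorm` as a given constant because it depends on the whole query rather than on the matching clauses alone:

```latex
\begin{aligned}
\mathrm{score}(q, d) &= \sum_{t \,\in\, q \cap d} \mathrm{queryWeight}(t) \times \mathrm{fieldWeight}(t, d) \\
\mathrm{queryWeight}(t) &= \mathrm{idf}(t) \times \mathrm{queryNorm} \\
\mathrm{fieldWeight}(t, d) &= \mathrm{tf}(t, d) \times \mathrm{idf}(t) \times \mathrm{fieldNorm}(d) \\
\mathrm{idf}(t) &= 1 + \ln\frac{\mathrm{maxDocs}}{\mathrm{docFreq}(t) + 1},
\qquad \mathrm{tf}(t, d) = \sqrt{\mathrm{termFreq}(t, d)} \\[4pt]
\text{M80806:}\quad & (6.6369815 \times 0.056276944) \times (1 \times 6.6369815 \times 1) = 0.37350902 \times 6.6369815 \approx 2.4789724 \\
\text{total:}\quad & 2.4789724 + 2.6825092 + 2.7758915 = 7.9373731 \approx 7.937373
\end{aligned}
```

In other words, each matching term contributes roughly idf² × queryNorm × tf × fieldNorm, and the document score is the sum of those contributions.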

1 Answer

Using the Ruby gem elasticsearch-explain-response, you can get a more readable explanation, e.g.:

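require 'elasticsearch'
client = Elasticsearch::Client.new
# Ask Elasticsearch to explain how document 1 scores against the query
result = client.explain index: "megacorp", type: "employee", id: "1", q: "last_name:Smith"
# Render the nested explanation as an indented tree of formulas
puts Elasticsearch::API::Response::ExplainResponse.new(result["explanation"]).render
#=>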
1.0 = 1.0(fieldWeight)
  1.0 = 1.0(tf(1.0)) x 1.0(idf(2/3)) x 1.0(fieldNorm)
    1.0 = 1.0(termFreq=1.0)
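
To check the arithmetic behind the JSON in the question by hand, here is a minimal sketch in plain Ruby (no gem needed). It assumes Lucene's classic TF-IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq). The helper names `idf`, `tf` and `term_score` are illustrative only, and `QUERY_NORM` is copied from the explain output rather than derived, since it depends on the whole query.

```ruby
# Recompute the per-term scores shown in the question's explain output.
# Assumption: classic Lucene TF-IDF similarity. Helper names are hypothetical.

QUERY_NORM = 0.056276944 # "queryNorm", copied from the explain output
MAX_DOCS   = 202_323     # "maxDocs", copied from the explain output

# idf(t) = 1 + ln(maxDocs / (docFreq + 1))
def idf(doc_freq, max_docs)
  1.0 + Math.log(max_docs.to_f / (doc_freq + 1))
end

# tf(t, d) = sqrt(termFreq)
def tf(freq)
  Math.sqrt(freq)
end

# score(t, d) = queryWeight * fieldWeight
def term_score(doc_freq:, freq: 1.0, field_norm: 1.0)
  term_idf     = idf(doc_freq, MAX_DOCS)
  query_weight = term_idf * QUERY_NORM             # "queryWeight, product of:"
  field_weight = tf(freq) * term_idf * field_norm  # "fieldWeight in 35, product of:"
  query_weight * field_weight
end

# term => docFreq, taken from the three "weight(FirstName:...)" clauses
TERMS = { "M80806" => 720, "M8086" => 551, "MHMT" => 489 }

scores = TERMS.transform_values { |doc_freq| term_score(doc_freq: doc_freq) }
scores.each { |term, score| puts format("%-7s %.4f", term, score) }

# The top-level "sum of:" node adds the per-term scores.
puts format("total   %.4f", scores.values.sum)
```

The results should agree with the `value` fields in the JSON up to small rounding differences, because Lucene computes these factors with 32-bit floats while Ruby uses 64-bit floats.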
Renaud