
I'm confused about the results returned by Elasticsearch.

I'm using a simple query for testing:

body: {
   size: 100,
   from: 0,
   query: { match_all: {} },
   fields: ["object"]    // <--- this is an object
}

Pretty straightforward.

I get hits, with hits.total = 141380.

But hits.hits.length = 49.

If I increase size to 1000, I get hits.hits.length = 129, while hits.total is still 141380.

If I don't use fields, I get all the docs in a readable format, but if I specify fields I get an array of arrays of objects of key arrays (yes, a complicated format for a search result!).

Can someone explain why it is different when using fields? I would expect to get back objects containing only the fields I requested.

MrE
  • is it possible that some of your documents don't have the `object` field? If out of 100 matching documents, only 49 have the `object` field, that would explain the results you're getting. – Val Mar 06 '16 at 05:40

1 Answer


You need to use source filtering to get the _source back with only the requested fields.

Just replace fields in your query with _source:

body: {
   size: 100,
   from: 0,
   query: { match_all: {} },
   _source: ["object"]    // <--- this is an object
}
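As a rough sketch of what reading the result then looks like: with source filtering, each hit carries a `_source` object trimmed to the requested keys. The response object below is hypothetical (the document contents are made up for illustration), and `extractSources` is just an illustrative helper name, not part of any client library.

```javascript
// Hypothetical response shape when `_source: ["object"]` is used.
// The document contents are invented for illustration only.
const response = {
  hits: {
    total: 2,
    hits: [
      { _index: "test_index1", _id: "1", _score: 1, _source: { object: { name: "a" } } },
      { _index: "test_index1", _id: "2", _score: 1, _source: { object: { name: "b" } } }
    ]
  }
};

// Collect the filtered _source of every returned hit.
function extractSources(res) {
  return res.hits.hits.map(function (hit) {
    return hit._source;
  });
}

const docs = extractSources(response);
console.log(JSON.stringify(docs));
```

Each element of `docs` is a plain object holding only the `object` key, which is the shape the question was expecting.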

Here's why the results (hit counts) are different when you used the fields option:

The fields parameter applies to fields that are explicitly marked as stored in the mapping; storing is off by default and generally not recommended. For backwards compatibility, if the fields parameter names fields that are not stored (store set to false in the mapping), Elasticsearch loads the _source and extracts the values from it.

Also, only leaf fields can be returned via the fields option, so object fields can't be returned and such requests fail.

But if you search across multiple indices with an object field in the fields option, and that field is absent or mapped to a non-object datatype (such as string or long) in some or all of the indices, the request does not fail for those indices, and their documents are returned in the hits array.

This is why you get different values for hits.total and hits.hits.length.
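One way to confirm this on your own response is to look at the `_shards` section next to the hit counts. The following is a minimal sketch, assuming the response is already available as a plain object; `summarizeSearch` is a hypothetical helper, and the sample values are illustrative rather than taken from a real cluster.

```javascript
// Summarize how many hits came back and how many shards failed.
function summarizeSearch(res) {
  const shards = res._shards || {};
  const failures = shards.failures || [];
  return {
    total: res.hits.total,
    returned: res.hits.hits.length,
    failedShards: shards.failed || 0,
    // One readable line per failed shard: "index[shard]: reason"
    reasons: failures.map(function (f) {
      return f.index + "[" + f.shard + "]: " + f.reason.reason;
    })
  };
}

// Illustrative partial-failure response: one shard rejected the object field.
const sample = {
  _shards: {
    total: 10,
    successful: 9,
    failed: 1,
    failures: [
      {
        shard: 1,
        index: "test_index1",
        reason: {
          type: "illegal_argument_exception",
          reason: "field [object] isn't a leaf field"
        }
      }
    ]
  },
  hits: { total: 25, hits: [{ _id: "1" }, { _id: "1" }, { _id: "1" }] }
};

const summary = summarizeSearch(sample);
if (summary.failedShards > 0) {
  console.log("partial results:", summary.reasons.join("; "));
}
```

A non-zero `failedShards` together with `returned` being lower than both `size` and `total` suggests that some shards rejected the request rather than that documents are missing.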

A typical response to such a multi-index query, with an object field in the fields option, looks like this:

{
  "took": 91,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 9,
    "failed": 1,
    "failures": [
      {
        "shard": 1,
        "index": "test_index1",
        "node": "GQN77mbqTSmmmwQlmjSBEg",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "field [object] isn't a leaf field"
        }
      }
    ]
  },
  "hits": {
    "total": 25,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index2",
        "_type": "test_type1",
        "_id": "1",
        "_score": 1
      },
      {
        "_index": "test_index2",
        "_type": "test_type2",
        "_id": "1",
        "_score": 1
      },
      {
        "_index": "test_index2",
        "_type": "test_type3",
        "_id": "1",
        "_score": 1,
        "fields": {
          "object": [
            "simple text"    <-- here the field 'object' is a leaf field
          ]
        }
      }
    ]
  }
}

Here hits.total is the total number of documents across all the indices searched, since it is a match_all query.

And hits.hits.length is the number of documents for which the request did not fail.
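For the hits that do come back through the fields option, note that each value is wrapped in an array, as with "object" in the output above. If you have to stay with fields, a small helper can flatten that shape. This is only a sketch; `flattenFields` is a hypothetical name, and unwrapping single-element arrays is a design choice that assumes you do not need to distinguish a one-element array from a scalar.

```javascript
// Convert a hit's `fields` map (whose values are arrays) into a plain object.
// Single-element arrays are unwrapped; longer arrays are kept as they are.
function flattenFields(hit) {
  const out = {};
  const fields = hit.fields || {};
  Object.keys(fields).forEach(function (key) {
    const values = fields[key];
    out[key] = Array.isArray(values) && values.length === 1 ? values[0] : values;
  });
  return out;
}

// Hit modeled on the last entry of the output above.
const fieldsHit = {
  _index: "test_index2",
  _type: "test_type3",
  _id: "1",
  _score: 1,
  fields: { object: ["simple text"] }
};

const flat = flattenFields(fieldsHit);
console.log(JSON.stringify(flat));
```

Hits without a `fields` key, like the first two in the output above, simply produce an empty object.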

Vinu Dominic
  • do you happen to know why using an object in 'fields' does not lead the proper result? – MrE Mar 06 '16 at 07:37
  • Edited the answer to include explanation :-) – Vinu Dominic Mar 06 '16 at 08:02
  • @MrE: My apologies, there's a correction. Elasticsearch strictly allows only same mapping for fields with same name in different mapping types in the same index. Hence the field 'object' in 'test_index' cannot be at the same time 'string' and 'object'. I've edited the answer accordingly. – Vinu Dominic Mar 06 '16 at 09:41
  • thanks for the detailed explanation. i'm still a bit confused about the usefulness of 'fields' if _source does it more consistently for all types of keys. Most examples in the elastic docs use fields. i guess i need to understand what stored vs not stored means in ES. will have to do some more reading... – MrE Mar 06 '16 at 17:55
  • Here's a SO question on the same: [Stored field in elastic search](http://stackoverflow.com/questions/16663635/stored-field-in-elastic-search). If you're not satisfied with the explanation, please feel free to ask another question and link it here, will be glad to explain. – Vinu Dominic Mar 06 '16 at 18:39