
I am a bit confused about the document count for a selected index in Elasticsearch. Below is the `_cat/indices` output for the index:

GET /_cat/indices/zipkin-span-2020-07-30?v

health status index                  uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   zipkin-span-2020-07-30 STcY29kkT3W7Y0XybbfVTQ   1   1     264996            0     88.9mb         88.9mb

It shows a document count of 264996, whereas I get very few records (20 at most) when I send the request below:

GET /zipkin-span-2020-07-30/_search

{"took":774,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},
"hits":{"total":{"value":10000,"relation":"gte"},"max_score":1.0,"hits":[{.... records.....}]}}

Note: I also tried the scroll API, but it still shows the same output. Questions:

  1. Is this issue related to duplicate records?
  2. Does this count include replica records as well?
Morez
  • The `"hits":{"total":{"value":10000,"relation":"gte"}` is saying you have greater than 10,000 docs, which confirms the total that `_cat` is reporting of 264,996. 10,000 is the max for total hits for performance reasons. Elastic doesn't care if documents are identical or not, they are still separate documents – Nate Jul 30 '20 at 23:18
  • @Morez it has been a long time. Did you get a chance to go through my answer, looking forward to get feedback from you :) And if it helped you resolve your issue, then please don't forget to upvote and accept my answer :) – ESCoder Sep 24 '20 at 16:01

2 Answers

4

@Bhavya's answer and @Nate's comment are perfect.

I will add a little more.

Do not use `_cat/indices` to check counts (see the description in the docs).

It doesn't report how many ES docs there are; it reports how many Lucene docs there are. The difference is that each nested doc is stored as its own Lucene doc. If one ES doc contains 5 nested docs, `_cat/indices` counts those nested docs as well (the 5 nested docs plus the parent), so it reports more than 1 doc.

To get an accurate count of Elasticsearch documents, use the cat count or count APIs.
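
As a minimal sketch of calling the count API from Python, the helper below builds the `/<index>/_count` URL and reads the `count` field from the JSON response. The function names and the `http://localhost:9200` base URL are assumptions for illustration, and the fetch only works against a reachable cluster without authentication.

```python
import json
from urllib.request import urlopen


def count_url(base_url: str, index: str) -> str:
    """Build the URL of the count API for a single index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def fetch_count(base_url: str, index: str) -> int:
    """Call the count API and return the `count` field of its JSON response."""
    with urlopen(count_url(base_url, index)) as resp:
        return json.load(resp)["count"]


# Example call (assumed local cluster, so it is left commented out):
# fetch_count("http://localhost:9200", "zipkin-span-2020-07-30")
```

The same number is available in tabular form from `GET /_cat/count/zipkin-span-2020-07-30?v`.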

Gibbs
  • I tried with count api as well but it is showing same count as it was showing for /_cat/indices. – Morez Aug 01 '20 at 13:35
  • If you want to see all the records you have, you can set the `size` parameter in the search request. Is that your issue? – Gibbs Aug 01 '20 at 13:44
2

By default, a search request counts the total hits accurately only up to 10,000 documents. If the total number of hits that match the query is greater than this value, the response indicates that the returned value is a lower bound (`"relation":"gte"`).

Refer to the official documentation for more details.
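
To make the lower-bound behaviour concrete, here is a small sketch that interprets `hits.total` from a response shaped like the one in the question (the 7.x object form with `value` and `relation`). The helper name `describe_total` is made up for illustration. The second snippet is a request body that sets `track_total_hits` to `true`, which asks Elasticsearch to count all matches exactly at some cost in speed.

```python
def describe_total(response: dict) -> str:
    """Turn hits.total into a readable string, respecting the relation."""
    total = response["hits"]["total"]
    if total["relation"] == "gte":
        # The value is only a lower bound on the real number of matches.
        return f"at least {total['value']}"
    return f"exactly {total['value']}"


# Response shape copied from the question, with the hits list left empty.
response = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
print(describe_total(response))  # at least 10000

# Request body for POST /<index>/_search that asks for an exact total.
exact_total_body = {"track_total_hits": True, "query": {"match_all": {}}}
```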

By default, Elasticsearch returns 10 documents per search request. To return more, add the `size` parameter to your query.
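
A rough sketch of building search bodies with `size` and `from` follows. The helper `page_body` is hypothetical, and the 10,000 limit reflects the default of the `index.max_result_window` setting, which may have been changed on your index. Past that window, the scroll or `search_after` mechanisms are the usual way to read the remaining documents.

```python
MAX_RESULT_WINDOW = 10_000  # default index.max_result_window (assumed unchanged)


def page_body(page: int, size: int = 10) -> dict:
    """Build a match_all search body for the given zero-based page."""
    start = page * size
    if start + size > MAX_RESULT_WINDOW:
        # from + size beyond the window is rejected by Elasticsearch.
        raise ValueError("from + size exceeds the result window; use scroll or search_after")
    return {"from": start, "size": size, "query": {"match_all": {}}}


# First page of 100 documents instead of the default 10.
first_page = page_body(0, size=100)
```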

ESCoder