
Elasticsearch (v7.9.2) provides the _cat/indices API to show index status. The most recent change to docs.count does not seem to be visible until a search or another update is made.

Is this behavior intended as a performance optimization?

And is there any way to make it always up to date?


@Update - How did I observe this?

I'm using Logstash to import data into ES. In the browser I have http://localhost:9200/_cat/indices?v open.

After each import I refresh the browser page, and the count usually changes.

After Logstash finishes and I terminate it, the count on the page is less than the count in the source DB (e.g. MySQL).

I then refresh the page again and again, but the count doesn't change.

However, once I send a query request against the ES index from Postman and refresh the page again, docs.count changes and the total matches the count in the source DB.

To summarize the behavior:

  • At first, docs.count does update after each import (i.e. insert).
  • But as importing continues for a while without any query on the index, the page's docs.count stops updating.
  • Then a query on the index forces docs.count to update to the correct number.
  • After that, the cycle repeats. It looks like some kind of delay-until-necessary optimization.

Here are the index settings from http://localhost:9200/xxx/_settings
(as requested in the comments):

{
  "xxx" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "provided_name" : "xxx",
        "creation_date" : "1602844600812",
        "analysis" : {
          "analyzer" : {
            "default_search" : {
              "type" : "ik_max_word"
            },
            "default" : {
              "type" : "ik_max_word"
            }
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "qLFMHhyBQNOOs1u_EcJbBg",
        "version" : {
          "created" : "7090299"
        }
      }
    }
  }
}
Eric
  • Not sure I understand, can you explain in more detail what is happening? – Val Oct 21 '20 at 04:08
  • @Val I've updated the question, please check. – Eric Oct 21 '20 at 05:41
  • Can you back your claims with some numbers? How can I reproduce this? – Val Oct 21 '20 at 06:35
  • @Val Please check the `@Update` part in the question. – Eric Oct 21 '20 at 08:11
  • Can you share the settings of your index, i.e. what you get from `http://localhost:9200/your-index/_settings` ? – Val Oct 21 '20 at 08:12
  • @Val Added. I didn't change many settings except the `analyzer` and `number_of_replicas`. – Eric Oct 21 '20 at 08:17
  • @Val BTW, I'm using `v7.9.2`, might this be a new `feature` (LoL) or performance improvement ? – Eric Oct 21 '20 at 08:35
  • @EricWang can you explain your second point ie `If another update is made to the index, then the docs.count is update to reflect the previous update, but not the newest one.` with some example? – Amit Oct 21 '20 at 13:20
  • @ElasticsearchNinja I did more tests on a fresh index and updated the summary part with some corrections. – Eric Oct 22 '20 at 02:57

2 Answers


The same issue occurs on ES v7.9.3.

From the official ES docs:

To get an accurate count of Elasticsearch documents, use the cat count or count APIs.

The cat count API is accurate on my ES cluster:

GET _cat/count/log-uwsgi-2021?v
epoch timestamp count
1638855942 05:45:42 500
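
For scripted checks, the same numbers can be read programmatically. The following is only a minimal sketch using the Python standard library: `parse_cat_count` and `fetch_count` are hypothetical helper names, and it assumes an unsecured node reachable at a base URL such as http://localhost:9200. The count API (`GET /<index>/_count`) returns JSON with a `count` field.

```python
import json
import urllib.request


def parse_cat_count(text):
    """Return the count from `_cat/count?v` output (a header row plus one data row)."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, data = rows[0], rows[1]
    # Look the column up by name so the order of columns does not matter.
    return int(data[header.index("count")])


def fetch_count(base_url, index):
    """Call the count API for one index and return its document count.

    base_url is an assumption, e.g. "http://localhost:9200" for a local node.
    """
    url = f"{base_url}/{index}/_count"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["count"]


# Parse the sample output shown above (no cluster needed for this part).
sample = "epoch timestamp count\n1638855942 05:45:42 500\n"
print(parse_cat_count(sample))  # prints 500
```

Calling `fetch_count("http://localhost:9200", "log-uwsgi-2021")` against a running node should then return the same number as the cat count output.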
Dongsheng

The latest docs.count is shown only after a refresh has occurred. The index is refreshed periodically, based on the index.refresh_interval setting.

From the documentation: Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds.
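
If the count has to be current right after an import, two options are to call the refresh API (`POST /<index>/_refresh`) or to set `index.refresh_interval` explicitly on the index. As far as I understand the docs, the search-idle behavior quoted above applies only while that setting is left unset. Here is a minimal Python sketch that builds both requests. The helper names are hypothetical, and the base URL and the index name `xxx` are taken from the question as assumptions.

```python
import json
import urllib.request


def refresh_request(base_url, index):
    """Build POST /<index>/_refresh, which makes recent writes visible."""
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")


def refresh_interval_request(base_url, index, interval="1s"):
    """Build PUT /<index>/_settings that sets index.refresh_interval explicitly."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a prepared request to a running node and decode the JSON reply."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Only build and inspect the request here; call send(req) against a live node.
req = refresh_interval_request("http://localhost:9200", "xxx")
print(req.get_method(), req.full_url, req.data.decode())
```

Refreshing more often costs indexing throughput, so during a bulk import a single `send(refresh_request(...))` at the end is probably the cheaper choice.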

hamid bayat
  • It's not just 1s: if I don't run a query or make a further update, the number won't update even after a minute. – Eric Oct 21 '20 at 07:57