What is the difference between node.disk used vs. index data/store size? How can the total index size be bigger than the disk used?
-
Can you back up your claims with some numbers coming from your specific use case? An index can definitely be larger than what a single disk on a single node can accept since an index is split into shards that are spread over several nodes. – Val Aug 07 '18 at 05:52
-
I am currently looking at the monitoring tab in Kibana for my cluster, trying to make sense of the data. In the cluster, I have 90 6-TB indexes, for ~540TB. Under overview, it shows that indices data is ~535TB. But in the nodes section of the overview, it shows Disk Available: 90TB / 350TB. I am trying to understand how this is possible. If the indexes use 540TB, but the disk available shows 90/350TB, where are the rest of the indexes stored? I am trying to figure out how to calculate capacity needs. – Jettin Yeh Aug 07 '18 at 06:20
-
Do you need additional information? – Jettin Yeh Aug 07 '18 at 19:00
-
Never mind, I figured it out. The value shown for disk available is not what it sounds like. The 350TB is only the total available/unused space and not the actual total. – Jettin Yeh Aug 08 '18 at 01:00
1 Answer
In Elasticsearch, store_size is the on-disk size of an index's shards, counting both primaries and replicas, while disk_used is the disk space in use on a node. In other words, node.disk_used is a per-node figure, while store_size is a per-index figure derived from the documents stored in that index. A single node can hold shards of multiple indexes, and a single index can be split into shards spread over several nodes. As for the second part of your question, this is an interesting overview of the problem you are having.
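To compare the two figures directly, one option is to add up the per-node disk numbers and set them against the index data total. The following is only a minimal sketch: it assumes the rows come from `GET _cat/allocation?format=json&bytes=b`, which reports `disk.indices`, `disk.used`, `disk.avail` and `disk.total` per node. The `summarize` helper, the node names and the byte counts are made up for illustration and are not taken from the cluster in the question.

```python
TB = 1024 ** 4

# Hypothetical rows shaped like the JSON output of
# `GET _cat/allocation?format=json&bytes=b` (values are byte counts as strings).
sample_rows = [
    {"node": "node-1", "disk.indices": str(5 * TB), "disk.used": str(6 * TB),
     "disk.avail": str(2 * TB), "disk.total": str(8 * TB)},
    {"node": "node-2", "disk.indices": str(4 * TB), "disk.used": str(5 * TB),
     "disk.avail": str(3 * TB), "disk.total": str(8 * TB)},
    # Rows without disk figures (for example the entry for unassigned shards)
    # carry null values and are skipped below.
    {"node": "UNASSIGNED", "disk.indices": None, "disk.used": None,
     "disk.avail": None, "disk.total": None},
]


def summarize(rows):
    """Sum the per-node disk figures across the cluster, in bytes."""
    totals = {"indices": 0, "used": 0, "avail": 0, "total": 0}
    for row in rows:
        if row.get("disk.total") is None:
            continue  # no disk information for this row
        for key in totals:
            totals[key] += int(row["disk." + key])
    return totals


totals = summarize(sample_rows)
for key, value in totals.items():
    print(f"{key}: {value / TB:.1f} TB")
```

If the summed `indices` value comes out larger than the summed `total`, the numbers being compared are most likely not the same quantity (for example "available" being read as "total capacity"), since index data on a node cannot exceed that node's disk.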

alephnerd
-
Thank you for the information. I understand that additional structure takes up space, and I expect the size of the indexes to be bigger than the size of the original documents (especially since my use case leverages heavy/specialized analysis). What I am struggling to understand is that the space taken up by an index needs to come from some disk. Each node has a set amount of disk, so I would expect the storage needed to hold the indexes to come from the node disks. So shouldn't the total used node disk space be ~ the index data size? In my case, I am seeing a much lower node disk_used than store size. – Jettin Yeh Aug 07 '18 at 06:35
-
So I am trying to understand how/where the indexes are stored. Or does the store size perhaps represent the uncompressed size, and node.disk_used the compressed size (and thus a smaller value)? – Jettin Yeh Aug 07 '18 at 06:40