
We have the following Ignite cluster configuration:

  • Apache Ignite version : 2.7.5
  • Ignite persistence is enabled (true)
  • 2 node cluster in partitioned mode
  • RAM: 210 GB per node
  • JVM heap (-Xms and -Xmx): 20 GB
  • Off-heap memory max: 120 GB
  • Number of records: 160 million

I can see the following node metrics:

[03:13:31,126][INFO][db-checkpoint-thread-#146%GridA%][GridCacheDatabaseSharedManager] Checkpoint finished [cpId=df22db5b-6ffa-4f5d-b6da-d0e36c0492af, pages=1512, markPos=FileWALPointer [idx=6659, fileOff=249851578, len=49197], walSegmentsCleared=0, walSegmentsCovered=[], markDuration=26ms, pagesWrite=13ms, fsync=312ms, total=351ms]
[03:14:05,346][INFO][grid-timeout-worker-#67%GridA%][IgniteKernal%GridA] 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=25a3a57c, name=GridA, uptime=16 days, 22:40:01.512]
    ^-- H/N/C [hosts=10, nodes=10, CPUs=172]
    ^-- CPU [cur=1.17%, avg=4.94%, GC=0%]
    ^-- PageMemory [pages=30333907]
    ^-- Heap [used=3889MB, free=81.01%, comm=20480MB]
    ^-- Off-heap [used=119880MB, free=2.68%, comm=123179MB]
    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=99MB]
    ^--   metastoreMemPlc region [used=0MB, free=99.82%, comm=99MB]
    ^--   Default_Region region [used=119880MB, free=2.44%, comm=122880MB]
    ^--   TxLog region [used=0MB, free=100%, comm=99MB]
    ^-- Ignite persistence [used=253233MB]
    ^--   sysMemPlc region [used=0MB]
    ^--   metastoreMemPlc region [used=unknown]
    ^--   Default_Region region [used=253233MB]
    ^--   TxLog region [used=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=6, qSize=0]
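
The free percentages in the metrics above follow directly from the `used` and `comm` values, and the persistence figure gives a rough idea of how much of the on-disk data can be memory-resident at once. A minimal sketch of that arithmetic in plain Java; the class and method names are hypothetical, and the resident share is only an approximate upper bound, since the persisted size presumably also includes indexes and storage overhead:

```java
import java.util.Locale;

class OffHeapHeadroom {
    // Free share of a region in percent, from its 'used' and 'comm' values (both in MB).
    static double freePercent(long usedMb, long committedMb) {
        return 100.0 * (committedMb - usedMb) / committedMb;
    }

    // Share of the persisted data that can be held in memory at once, in percent.
    static double residentPercent(long inMemoryMb, long onDiskMb) {
        return 100.0 * inMemoryMb / onDiskMb;
    }

    public static void main(String[] args) {
        // Values copied from the metrics log above.
        long usedMb = 119_880;
        long regionCommMb = 122_880;   // Default_Region comm
        long offHeapCommMb = 123_179;  // Off-heap comm (all regions)
        long persistedMb = 253_233;    // Ignite persistence used

        System.out.println(String.format(Locale.ROOT,
            "Default_Region free: %.2f%%", freePercent(usedMb, regionCommMb)));
        System.out.println(String.format(Locale.ROOT,
            "Off-heap free: %.2f%%", freePercent(usedMb, offHeapCommMb)));
        System.out.println(String.format(Locale.ROOT,
            "Persisted data that fits in memory: %.2f%%", residentPercent(usedMb, persistedMb)));
    }
}
```

The first two results reproduce the 2.44% and 2.68% shown in the log. The third suggests that less than half of the roughly 253 GB persisted on this node can be in memory at any one time, so some reads would go to disk whether or not the node is restarted.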

Does the Ignite node require a restart, or should page replacement kick in and free up some off-heap space?

Edit-2: As you can see, free off-heap memory is at ~2.5% and page replacement (PR) still hasn't been triggered. I could not find anything on when PR is triggered. Will it be triggered at free space = 0%? Is there a possibility that my Ignite node will shut down if free space reaches 0%? Are there any implications for query performance when page replacement eventually triggers?

User_Targaryen

1 Answer


When persistence is enabled, page replacement is triggered once a data region fills up.

Pavel Vinokurov
  • Pavel, as you can see, free off-heap memory is at ~2.5% and page replacement (PR) still hasn't been triggered. I could not find anything on when PR is triggered. Will it be triggered at free space = 0%? Is there a possibility that my Ignite node will shut down if free space reaches 0%? Are there any implications for query performance when page replacement triggers? – User_Targaryen Dec 08 '20 at 03:20
  • Page replacement begins once there is no space left in memory to fit a record. For instance, if you insert a 1 KB record and there are just 500 bytes of free space left in RAM, replacement will be triggered. Your node won’t fail, but performance will be impacted because Ignite will start actively using disk to serve your requests. – dmagda Dec 08 '20 at 05:01
  • @dmagda: Thanks for the insight. We have been restarting the Ignite node when we get close to 5% free off-heap space. Technically a restart should be worse, right? [Off-heap memory is emptied, so for the next few hours all my queries go to disk while the off-heap is slowly repopulated.] So, is a restart the better option, or should we allow page replacement to happen? – User_Targaryen Dec 08 '20 at 09:03
  • You can warm the memory up on restarts so that all or most queries run over in-memory records. In any case, restarts are not a solution if you are running out of memory. Page replacement lets Ignite use disk and keeps you running when not enough memory is available. Take advantage of this. And if excessive disk usage starts impacting your SLAs, scale out the cluster to allocate more memory, and page replacement will stop. https://www.gridgain.com/resources/blog/out-of-memory-apache-ignite-cluster-handling-techniques – dmagda Dec 09 '20 at 15:03
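
The rule dmagda describes (nothing is replaced until no free space is left, and the node keeps running afterwards at the cost of extra disk reads) can be illustrated with a toy model. This is only a sketch in plain Java, not Ignite code: `ToyPageMemory` is a hypothetical class, and it uses a simple least-recently-used policy as a stand-in for Ignite's actual replacement algorithm, which works differently.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Toy model of a fixed-size page memory backed by disk (NOT Ignite's implementation).
class ToyPageMemory {
    private final int capacityPages;
    // Access-ordered map: iteration starts at the least recently used page.
    private final LinkedHashMap<Long, Boolean> resident = new LinkedHashMap<>(16, 0.75f, true);
    private long replacements;
    private long loads;

    ToyPageMemory(int capacityPages) {
        this.capacityPages = capacityPages;
    }

    // Touches a page. Returns true if it was already in memory, false if it had to be
    // brought in (allocated or read from disk), possibly replacing another page.
    boolean access(long pageId) {
        if (resident.get(pageId) != null) {
            return true; // in-memory hit; get() also marks the page as recently used
        }
        loads++;
        if (resident.size() >= capacityPages) {
            // No free page left: only now is a resident page replaced.
            Iterator<Long> it = resident.keySet().iterator();
            it.next();
            it.remove();
            replacements++;
        }
        resident.put(pageId, Boolean.TRUE);
        return false;
    }

    long replacements() { return replacements; }

    long loads() { return loads; }

    public static void main(String[] args) {
        ToyPageMemory mem = new ToyPageMemory(4);
        for (long p = 0; p < 4; p++) {
            mem.access(p); // fills the region completely
        }
        System.out.println("replacements at 0% free: " + mem.replacements());
        mem.access(4); // no room left, so the least recently used page is replaced
        System.out.println("replacements after one more page: " + mem.replacements());
        System.out.println("page 0 still in memory: " + mem.access(0));
    }
}
```

In this model the replacement counter stays at zero while any free page remains, which matches the observation that nothing happens at ~2.5% free. Once the region is full, every access to a non-resident page costs a load, which is the performance impact mentioned above; nothing in the model "fails" when free space reaches zero.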