We are running a 2-client 3-server cluster. We keep seeing OOM even as we provide more heap. Below is the heap dump snapshot, any idea why we see so many CacheEvent objects, what is really happening here?
Looking at the source for 'GridSelectorNioSessionImpl', seems it has an unbounded queue that is accumulating WriteRequests. Any idea why are these not being flushed? I was reading about setting messageQueue, even the ignite logs warns about it possibly resulting into an OOM, since its unbounded. But i could not correlate that setting to the queue initialization happening in 'GridSelectorNioSessionImpl'.
Below is the thread stack which resulted in the OOM,
"sys-#300%ip-192-168-4-45_us-west-2_compute_internal.cache-2%" prio=5 tid=367 RUNNABLE
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:48)
at org.apache.ignite.internal.binary.streams.BinaryHeapOutputStream.arrayCopy(BinaryHeapOutputStream.java:101)
local variable: org.apache.ignite.internal.binary.streams.BinaryHeapOutputStream#1
at org.apache.ignite.internal.binary.BinaryWriterExImpl.array(BinaryWriterExImpl.java:239)
at org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:253)
local variable: org.apache.ignite.internal.binary.BinaryWriterExImpl#1
at org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:84)
at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:56)
local variable: org.apache.ignite.internal.binary.BinaryMarshaller#1
local variable: java.lang.String#77501
at org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10824)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1953)
local variable: org.apache.ignite.internal.processors.continuous.GridContinuousProcessor#1
local variable: java.util.Collections$SingletonList#198
local variable: org.apache.ignite.internal.processors.continuous.GridContinuousMessage#72196
local variable: org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$9#72196
local variable: org.apache.ignite.internal.processors.continuous.GridContinuousMessage#72196
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1934)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1916)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:1321)
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1258)
at org.apache.ignite.internal.GridEventConsumeHandler$2$1.run(GridEventConsumeHandler.java:250)
local variable: org.apache.ignite.internal.GridEventConsumeHandler$2$1#1
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
local variable: org.apache.ignite.thread.IgniteThreadPoolExecutor#53
local variable: java.util.concurrent.ThreadPoolExecutor$Worker#47
local variable: org.apache.ignite.thread.IgniteThread#78
local variable: org.apache.ignite.internal.GridEventConsumeHandler$2$1#1
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
local variable: java.util.concurrent.ThreadPoolExecutor$Worker#47
at java.lang.Thread.run(Thread.java:834)
The complete thread dump off the OOM is available here https://easyupload.io/fs3y8x
TIA