I am getting the exception below in our production environment. After it occurs, we get serialization exceptions continuously.
Heuristic completion: outcome state is mixed; nested exception is org.springframework.data.gemfire.GemfireTransactionCommitException:
Unexpected failure on commit of Cache local transaction; nested exception is com.gemstone.gemfire.cache.CommitIncompleteException:
Incomplete commit of transaction TXId: c11pcsssvc64(eisCacheServer_c11pcsssvc64.dswh.ds.adp.com:63400)<v2>:4723:70384. Caused by the following exceptions:
From member: 100.99.18.94(eisCacheServer_c13pcsssvc696.dswh.ds.adp.com:77303)<v3>:23007 com.gemstone.gemfire.cache.query.IndexMaintenanceException:
com.gemstone.gemfire.cache.query.internal.index.IMQException, caused by com.gemstone.gemfire.cache.query.internal.index.IMQException
at com.gemstone.gemfire.internal.cache.LocalRegion.txApplyPutPart2(LocalRegion.java:5090)
at com.gemstone.gemfire.internal.cache.AbstractRegionMap.txApplyPut(AbstractRegionMap.java:3488)
at com.gemstone.gemfire.internal.cache.LocalRegion.txApplyPut(LocalRegion.java:5058)
at com.gemstone.gemfire.internal.cache.TXCommitMessage$RegionCommit.txApplyEntryOp(TXCommitMessage.java:1296)
at com.gemstone.gemfire.internal.cache.TXCommitMessage$RegionCommit$FarSideEntryOp.process(TXCommitMessage.java:1566)
at com.gemstone.gemfire.internal.cache.TXCommitMessage.basicProcessOps(TXCommitMessage.java:719)
at com.gemstone.gemfire.internal.cache.TXCommitMessage.basicProcess(TXCommitMessage.java:655)
at com.gemstone.gemfire.internal.cache.TXCommitMessage$CommitProcessMessage.basicProcess(TXCommitMessage.java:1737)
at com.gemstone.gemfire.internal.cache.TXCommitMessage$CommitProcessForLockIdMessage.process(TXCommitMessage.java:1657)
at com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:305)
at com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:368)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:692)
at com.gemstone.gemfire.distributed.internal.DistributionManager$4$1.run(DistributionManager.java:963)
at java.lang.Thread.run(Thread.java:745).,
After this, the data in the region gets corrupted, and we get a serialization exception whenever we hit any API that reads data from the corrupted region:
An IOException was thrown while deserializing; nested exception is com.gemstone.gemfire.SerializationException:
An IOException was thrown while deserializing ,Cause=org.springframework.dao.DataAccessResourceFailureException:
An IOException was thrown while deserializing; nested exception is com.gemstone.gemfire.SerializationException:
An IOException was thrown while deserializing
Prod setup: we have 2 data nodes (embedded Tomcat GemFire nodes), 2 locators, and 8 regions. The GemFire version is 7.0.2 (we cannot upgrade to the latest version).
After analysing this, we suspect it happens while data is being synced between the 2 data nodes. The problem is that we couldn't reproduce the issue in any lower environment; it only happens intermittently on prod. Does anyone have any idea what causes this?