We have a Neo4j instance running in Rancher, using the Docker image for 3.3. It had 1 GB of heap space and page cache, but since this problem started we have increased that to 2 GB.
Everything runs smoothly for a while until we start getting this error:
2018-08-02 06:03:30.310+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [1942]: Starting check pointing...
2018-08-02 06:03:30.311+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by scheduler for time threshold [1942]: Starting store flush...
2018-08-02 06:03:30.313+0000 ERROR [o.n.k.i.t.l.c.CheckPointerImpl] Error performing check point java.io.IOException: I/O error
org.neo4j.kernel.impl.store.UnderlyingStorageException: java.io.IOException: I/O error
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:263)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.flushAndForce(RecordStorageEngine.java:460)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.doCheckPoint(CheckPointerImpl.java:160)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.checkPointIfNeeded(CheckPointerImpl.java:134)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:109)
Caused by: java.io.IOException: I/O error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
at org.neo4j.io.fs.StoreFileChannel.force(StoreFileChannel.java:111)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.force(SingleFilePageSwapper.java:711)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForceInternal(MuninnPagedFile.java:425)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:274)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:901)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:894)
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:259)
... 12 more
This error repeats several times, and then we get this one:
2018-08-02 06:05:00.364+0000 ERROR [o.n.k.i.DatabaseHealth] Database panic: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details. Error performing check point
org.neo4j.kernel.impl.store.UnderlyingStorageException: Error performing check point
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.constructCombinedFailure(CheckPointScheduler.java:100)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:81)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:109)
Suppressed: org.neo4j.kernel.impl.store.UnderlyingStorageException: java.io.IOException: I/O error
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:263)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.flushAndForce(RecordStorageEngine.java:460)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.doCheckPoint(CheckPointerImpl.java:160)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.checkPointIfNeeded(CheckPointerImpl.java:134)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:64)
... 8 more
Caused by: java.io.IOException: I/O error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
at org.neo4j.io.fs.StoreFileChannel.force(StoreFileChannel.java:111)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.force(SingleFilePageSwapper.java:711)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForceInternal(MuninnPagedFile.java:425)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:274)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:901)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:894)
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:259)
... 12 more
Suppressed: org.neo4j.kernel.impl.store.UnderlyingStorageException: java.io.IOException: I/O error
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:263)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.flushAndForce(RecordStorageEngine.java:460)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.doCheckPoint(CheckPointerImpl.java:160)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.checkPointIfNeeded(CheckPointerImpl.java:134)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:64)
... 8 more
Caused by: java.io.IOException: I/O error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
at org.neo4j.io.fs.StoreFileChannel.force(StoreFileChannel.java:111)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.force(SingleFilePageSwapper.java:711)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForceInternal(MuninnPagedFile.java:425)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:274)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:262)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:921)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:894)
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:259)
... 12 more
Suppressed: org.neo4j.kernel.impl.store.UnderlyingStorageException: java.io.IOException: I/O error
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:263)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.flushAndForce(RecordStorageEngine.java:460)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.doCheckPoint(CheckPointerImpl.java:160)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.checkPointIfNeeded(CheckPointerImpl.java:134)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:64)
... 8 more
Caused by: java.io.IOException: I/O error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
at org.neo4j.io.fs.StoreFileChannel.force(StoreFileChannel.java:111)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.force(SingleFilePageSwapper.java:711)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForceInternal(MuninnPagedFile.java:425)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:274)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:262)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:921)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:894)
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:259)
... 12 more
Suppressed: org.neo4j.kernel.impl.store.UnderlyingStorageException: java.io.IOException: I/O error
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:263)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.flushAndForce(RecordStorageEngine.java:460)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.doCheckPoint(CheckPointerImpl.java:160)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointerImpl.checkPointIfNeeded(CheckPointerImpl.java:134)
at org.neo4j.kernel.impl.transaction.log.checkpoint.CheckPointScheduler$1.run(CheckPointScheduler.java:64)
... 8 more
Caused by: java.io.IOException: I/O error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method)
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
at org.neo4j.io.fs.StoreFileChannel.force(StoreFileChannel.java:111)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.force(SingleFilePageSwapper.java:711)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForceInternal(MuninnPagedFile.java:425)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:274)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.flushAndForce(MuninnPagedFile.java:262)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:921)
at org.neo4j.index.internal.gbptree.GBPTree.checkpoint(GBPTree.java:894)
at org.neo4j.kernel.impl.index.labelscan.NativeLabelScanStore.force(NativeLabelScanStore.java:259)
... 12 more
After this, the database is unreachable, and we have to restart it to get it working again.
This happens at irregular intervals, and we have not yet found a definitive cause or any posts describing a similar problem.
Has anyone seen this before, or does anyone know what I could do to prevent it from happening again?
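In case it helps with lining up the timeline, here is a minimal sketch of how the failed check points leading up to the panic could be pulled out of the log. It is only an illustration: the function name `summarize` and the `SAMPLE` list are made up for this post, and the regular expression assumes every log entry starts with the same timestamp, level, and logger layout as the lines pasted above.

```python
import re
from datetime import datetime

# Assumed entry layout, based only on the lines quoted above:
# "<date> <time>.<millis>+0000 LEVEL [logger] message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})[+-]\d{4} "
    r"(?P<level>[A-Z]+) \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Return (check point failure timestamps, panic timestamp or None).

    Stack trace lines do not match LINE_RE and are skipped.
    Scanning stops at the first "Database panic" entry.
    """
    failures = []
    panic = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("level") != "ERROR":
            continue
        timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        message = match.group("msg")
        if message.startswith("Database panic"):
            panic = timestamp
            break
        if message.startswith("Error performing check point"):
            failures.append(timestamp)
    return failures, panic


# Two entries copied from the log excerpts in this post (messages shortened).
SAMPLE = [
    "2018-08-02 06:03:30.313+0000 ERROR [o.n.k.i.t.l.c.CheckPointerImpl] "
    "Error performing check point java.io.IOException: I/O error",
    "2018-08-02 06:05:00.364+0000 ERROR [o.n.k.i.DatabaseHealth] "
    "Database panic: The database has encountered a critical error",
]

if __name__ == "__main__":
    failed, panicked_at = summarize(SAMPLE)
    print(len(failed), "failed check point(s) before panic at", panicked_at)
```

Running `summarize` over the full log file (for example `summarize(open(path))`, where `path` is wherever the log lives in your container) should show how many check points failed and how long the failures went on before the panic.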