Apache Cassandra 3.11.2, 12 nodes. One node logged this error while running `nodetool cleanup`:

ERROR [CompactionExecutor:4996] 2019-04-10 16:09:43,938 CassandraDaemon.java:228 - Exception in thread Thread[CompactionExecutor:4996,1,main]
java.lang.AssertionError: null
        at org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:474) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:239) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:163) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:290) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:179) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:134) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.db.compaction.CompactionManager.doCleanupOne(CompactionManager.java:1137) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:86) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.db.compaction.CompactionManager$6.execute(CompactionManager.java:475) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:322) ~[apache-cassandra-3.11.2.jar:3.11.2]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_191]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_191]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_191]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.2.jar:3.11.2]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
ERROR [Reference-Reaper:1] 2019-04-10 16:10:01,613 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6bfdb59e) to class org.apache.cassandra.io.util.FileHandle$Cleanup@1698322065:/data/cassandra/data_files/path/mc-135-big-Index.db was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-04-10 16:10:01,626 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6105deee) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@604462250:/data/cassandra/data_files/path/mc-135-big-Data.db was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-04-10 16:10:01,626 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@22504cf1) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@359209204:[Memory@[0..44), Memory@[0..4c8)] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2019-04-10 16:10:01,627 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5ad45619) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1143514085:Memory@[7eaa3bdf4000..7eaa3bdf7200) was not released before the reference was garbage collected

What do we need to do? Should we replace this node with a new one? What does this error mean? We ran the cleanup after `nodetool repair` completed successfully on all our nodes.

evyatars

1 Answer

This has been an issue since early versions of Cassandra and has already been reported to the dev team. The condition seems to be innocuous, but if it only ever occurs on the same node, you may need to replace that node.
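
To check whether the error keeps hitting the same node, one option is to scan each node's log for compaction-thread errors and for the SSTable files named in the `LEAK DETECTED` lines. The following is only a minimal Python sketch: it assumes the log line layout shown in the question, and the function name `summarize_log` is hypothetical, not part of any Cassandra tooling.

```python
import re
from collections import Counter

# Header of an ERROR entry, e.g.
# "ERROR [CompactionExecutor:4996] 2019-04-10 16:09:43,938 CassandraDaemon.java:228 - ..."
ERROR_RE = re.compile(
    r"^ERROR \[(?P<thread>[^\]]+)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
)
# SSTable component path inside a LEAK DETECTED line (file-backed leaks only).
LEAK_PATH_RE = re.compile(r"LEAK DETECTED: .*?@\d+:(?P<path>/\S+\.db)")


def summarize_log(lines):
    """Count compaction-thread errors and collect leaked SSTable file paths."""
    counts = Counter()
    leaked_files = []
    for line in lines:
        match = ERROR_RE.match(line)
        if not match:
            continue  # stack-trace lines and non-ERROR entries are skipped
        if match.group("thread").startswith("CompactionExecutor"):
            counts["compaction_errors"] += 1
        leak = LEAK_PATH_RE.search(line)
        if leak:
            counts["file_leaks"] += 1
            leaked_files.append(leak.group("path"))
    return counts, leaked_files
```

Running it over the log of every node (for example `summarize_log(open(path))`, where `path` is wherever your installation writes its system log) and comparing the counts gives a rough picture of whether the problem is tied to one machine or moves around the cluster.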

Carlos Monroy Nieblas
  • It happened on 2 nodes in the cluster. Is there a way to disconnect those nodes, wipe them, and resync the data from the beginning? Would that help? Or is it better to add new nodes to the cluster and, once that is done, drop the 2 problematic nodes? Thanks. – evyatars Apr 12 '19 at 10:25
  • Before going through a whole replacement, have you tried a repair on the cluster? Executing `nodetool repair` one node at a time will ensure consistency across all the data. – Carlos Monroy Nieblas Apr 12 '19 at 15:33
  • We ran `nodetool cleanup` after `nodetool repair` finished on all nodes in our cluster (one node at a time). The repair completed without errors, but it didn't fix this issue, which appears when we run `nodetool cleanup`. Anyway, we will add 2 new nodes first, run repair again, and then remove the problematic nodes. After the removal we will run `nodetool repair` once more. Does that sound okay? Thanks! – evyatars Apr 12 '19 at 19:14
  • Yes, it may fix your issue. Just be sure to also run `nodetool cleanup` one node at a time once the replacement of the nodes is finished, as the ownership of the data on your nodes is going to change. – Carlos Monroy Nieblas Apr 12 '19 at 19:19
  • Thanks! After replacing the 2 nodes, the error did not come back on the 2 new nodes, but it appeared on another node in the cluster where it had not appeared before... Maybe a Cassandra bug? – evyatars Apr 30 '19 at 12:46
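
The "one node at a time" advice for `nodetool repair` and `nodetool cleanup` in the comments above can be scripted. What follows is a rough Python sketch, not an official procedure: it assumes `nodetool` is on the PATH, that each node's JMX port is reachable from the machine running the script, and that `nodetool` reports failure through a nonzero exit code (worth verifying on your version). The function name `rolling_nodetool` and the host names are hypothetical.

```python
import subprocess


def rolling_nodetool(hosts, command, run=subprocess.run):
    """Run one nodetool command on each host in turn; stop at the first failure."""
    completed = []
    for host in hosts:
        # nodetool's -h option selects which node to connect to over JMX.
        result = run(["nodetool", "-h", host, command])
        if result.returncode != 0:
            raise RuntimeError(f"nodetool {command} failed on {host}")
        completed.append(host)
    return completed


if __name__ == "__main__":
    # Hypothetical host names; replace them with the nodes of your own cluster.
    nodes = ["cassandra-01", "cassandra-02", "cassandra-03"]
    rolling_nodetool(nodes, "repair")
    rolling_nodetool(nodes, "cleanup")
```

Because each call blocks until `nodetool` returns, only one node is repairing or cleaning up at any moment, and the loop stops at the first node that reports a failure so it can be inspected before continuing.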