We use multiple data directories in Cassandra on EC2. The volume behind one of the data directories became 100% full, whereas the other one was 30% empty. After that, a lot of writes failed and eventually Cassandra stopped. I debugged and realised my disk_failure_policy was set to stop, so I changed it to best_effort and tried starting Cassandra again. As mentioned here, with best_effort, if Cassandra can't write to a disk, the disk becomes blacklisted for writes. So, ideally, the volume which was 100% full should have been blacklisted for writes. But on startup I got an error that no disk space was left, and Cassandra didn't start. So, what should we do when we have multiple data directories and one of them becomes full? I am expecting better answers than just "increase the size of the full volume".
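To catch this kind of imbalance before a volume fills up, something like the following could be run periodically. It is only a minimal sketch: the function names `usage_report` and `nearly_full` and the 90% threshold are made up for illustration, and the directory list has to be copied by hand from `data_file_directories` in cassandra.yaml.

```python
import shutil

def usage_report(data_dirs):
    """Return usage statistics for the filesystem behind each data directory."""
    report = []
    for path in data_dirs:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        pct_used = 100.0 * usage.used / usage.total if usage.total else 0.0
        report.append({
            "path": path,
            "free_bytes": usage.free,
            "pct_used": pct_used,
        })
    return report

def nearly_full(report, threshold_pct=90.0):
    """Return the paths whose filesystem is at or above the threshold."""
    # threshold_pct is an arbitrary illustrative default, not a Cassandra setting.
    return [r["path"] for r in report if r["pct_used"] >= threshold_pct]
```

For example, `nearly_full(usage_report(["/data1", "/data2"]))` (hypothetical mount points) would list the directories worth looking at before Cassandra hits the disk failure policy.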

Naresh
  • Are there any snapshots that can be deleted? At least then Cassandra will start. Also, as a temporary measure, I have on occasion bind-mounted another volume to provide more disk space. – LHWizard Aug 28 '19 at 13:38
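To see whether snapshots are worth deleting while the node is down, a rough sketch like this can list them by size. It assumes the usual on-disk layout of `<data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>`; the function name `snapshot_sizes` is invented for this example. Snapshot files are hard links to SSTables, so the reported size is an upper bound on the space that deleting a snapshot would actually free.

```python
import os

def snapshot_sizes(data_dir):
    """Map each snapshot directory under data_dir to its apparent size in bytes."""
    sizes = {}
    for root, dirs, _files in os.walk(data_dir):
        if os.path.basename(root) == "snapshots":
            for name in dirs:
                snap = os.path.join(root, name)
                total = 0
                # Sum every file inside this snapshot, including nested dirs.
                for sub, _subdirs, files in os.walk(snap):
                    for fname in files:
                        total += os.path.getsize(os.path.join(sub, fname))
                sizes[snap] = total
            dirs[:] = []  # snapshots already measured; stop the outer walk here
    return sizes
```

Sorting the result by value, largest first, shows which snapshots to consider removing on the full volume.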

2 Answers

One thing I would check is the location of the commitlog. Under write-heavy workloads with too high a memtable_cleanup_threshold, the commitlog can build up to undesirable levels. In the old spinning-disk world, it was an accepted practice to keep the commitlog on a different physical disk (for disk I/O throughput reasons). Anyway, I'd make sure that the commitlog isn't responsible for the disk footprint increase.

Cassandra will try to spread data across the data dirs evenly. That being said, if one is growing faster than another, you might be writing to a few partitions disproportionately more than to others. If that's the case, then you may want to look at your data model.

Otherwise, if the node is bricked and the dirs are lopsided, IMO the best option would be to wipe it and re-bootstrap it. Cassandra should spread the data evenly across the dirs on bootstrap.

Aaron

We have a system that uses multiple data directories. For the most part, Cassandra keeps things pretty evenly spread. However, if you have some large tables using Size Tiered Compaction, you could run out of space during compaction. In general, the data distribution should be close between the volumes, as Cassandra tries to maintain that. But again, there are no guarantees of 100% equality. If you are running out of space, add another directory before things get "full" and halt, in the hope that Cassandra can spread things out better before it's too late. At this point, though, you may have to take down the node, add another volume, move things around yourself to get it better balanced, and then start it back up.
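As a rough way to judge whether a size-tiered table still has room to compact on its volume, here is a sketch. It assumes a pessimistic case where every SSTable of the table is compacted at once, so the output could be as large as the sum of the inputs. The function name `stcs_headroom` is made up, and the check only looks at the `*-Data.db` files in one table directory.

```python
import glob
import os
import shutil

def stcs_headroom(table_dir):
    """Compare the table's SSTable data size with free space on its volume."""
    data_files = glob.glob(os.path.join(table_dir, "*-Data.db"))
    # Pessimistic assumption: compaction output is as large as all inputs combined.
    needed = sum(os.path.getsize(path) for path in data_files)
    free = shutil.disk_usage(table_dir).free
    return {"needed_bytes": needed, "free_bytes": free, "fits": free >= needed}
```

If `fits` comes back false for a large table, that directory is a candidate for moving data off or for adding a volume before the next big compaction.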

Jim Wartnick