I'm running a Cassandra cluster with 5 nodes, each with ten 1 TB disks (JBOD). Currently, one of the nodes is in a problematic state: a large compaction can no longer complete successfully because a single disk runs out of space.
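To make the problem concrete, here is a minimal sketch of how free space per data directory could be checked against the space a compaction is expected to need. It uses only the Python standard library; the function names and the idea of passing in a list of directories (for example, the entries of `data_file_directories` in `cassandra.yaml`) are illustrative assumptions, not part of any Cassandra tooling.

```python
import shutil


def free_space_report(data_dirs):
    """Return {directory: (free_bytes, used_fraction)} for each data directory.

    `data_dirs` is a hypothetical list of JBOD mount points, e.g. the
    entries of `data_file_directories` in cassandra.yaml.
    """
    report = {}
    for directory in data_dirs:
        usage = shutil.disk_usage(directory)
        # Guard against a zero-sized filesystem to avoid division by zero.
        used_fraction = usage.used / usage.total if usage.total else 0.0
        report[directory] = (usage.free, used_fraction)
    return report


def dirs_that_fit(data_dirs, required_bytes):
    """Return the directories whose free space is at least `required_bytes`."""
    report = free_space_report(data_dirs)
    return [d for d in data_dirs if report[d][0] >= required_bytes]
```

Calling `dirs_that_fit(dirs, estimated_compaction_bytes)` with an empty result would correspond to the situation described above: no single disk has enough room for the compaction output, even if the node as a whole does.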
I am trying to figure out what the effect will be of adding additional disks to the JBOD configuration.
- Will the existing data be redistributed automatically to make optimal use of the new disks?
- Will only new data be written to the newly added disks?
- Can I manually move SSTables to different disks?
- Is splitting the SSTables an option?
I found sources online which are not fully conclusive:
- https://stackoverflow.com/questions/23110054/cassandra-adding-disks-increase-storage-volume-without-adding-new-nodes seems to suggest that the "data will even out between disks over time", but doesn't specify whether that's due to rebalancing or because new data will be written only to the new disk (it is also an old link, so I'm not sure it is still relevant).
- http://mail-archives.apache.org/mod_mbox/cassandra-user/201610.mbox/%3cCAMy13tA3cZ++LaVnUsuwkwbR5tvBdhMEOqWij9nrWRODq42rLQ@mail.gmail.com%3e seems to imply that, with Cassandra 3.2+, compactions always run local to a single data disk.