Our application runs on a six-node Cassandra cluster spread across two data centers.
Cluster information:
Cassandra version : 2.0.3
Snitch : GossipingPropertyFileSnitch
Partitioner : Murmur3Partitioner
Each DC has three nodes.
Each DC has a replication factor of 2.
Each node uses num_tokens: 256 (all tokens are virtual nodes).
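For reference, the topology above is declared roughly as follows; the keyspace name "myks" and the rack name are placeholders, not our real identifiers.

    # Keyspace replication as described above (RF 2 in each DC); "myks" is a placeholder.
    echo "CREATE KEYSPACE myks WITH replication =
      {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 2};" | cqlsh

    # Relevant per-node settings in cassandra.yaml:
    #   num_tokens: 256
    #   partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    #   endpoint_snitch: GossipingPropertyFileSnitch
    #
    # With GossipingPropertyFileSnitch, each node's DC/rack is read from
    # cassandra-rackdc.properties, e.g. (rack name is a placeholder):
    #   dc=DC1
    #   rack=RAC1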
DC1 is the live (local) DC which currently serves data to the users; DC2 is only a backup (remote) DC and does not serve any user traffic. Since the planned maintenance affects DC1 alone, we are going to switch the remote DC, DC2, to serve the users during the maintenance period.
During the outage, the whole of DC1 may be down for a few days. Once the maintenance is done, we will make DC1 serve data again and return DC2 to its backup role, so DC1 must end up with up-to-date data after the outage. Our application will handle a large amount of data (a few GB) during the outage.
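Before the failover, we assume DC2 should be brought in sync with DC1 while both DCs are still up; a minimal sketch of that step, with a placeholder keyspace name:

    # Run on each DC2 node before switching traffic, so DC2 starts the outage
    # with its replicas in sync with DC1. "myks" is a placeholder keyspace name.
    nodetool repair -pr myks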
Before taking DC1 down:
1) What needs to be taken care of on the DC1 nodes (commit-log settings, etc.)?
2) What needs to be taken care of on the DC2 nodes (hinted-handoff settings, etc.)? (A sketch of the settings we currently have in mind for both points follows this list.)
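For questions 1) and 2), these are the knobs we are aware of so far (a sketch; the values shown are the 2.0.x defaults, not something we have tuned):

    # DC1 nodes, just before shutting Cassandra down: flush memtables and stop
    # accepting connections, so nothing is left only in the commit log.
    nodetool drain

    # DC2 nodes, hint-related settings in cassandra.yaml (2.0.x defaults):
    #   hinted_handoff_enabled: true
    #   max_hint_window_in_ms: 10800000   # 3 hours; hints for DC1 stop after this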
During the outage:
3) When the entire DC1 is down, where will the hints be written (on the DC2 nodes?) and how should we handle those hints?
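On the hints side, this is what we know how to inspect so far (a sketch; in 2.0.x, hints are stored in the coordinator node's local system.hints table):

    # On a DC2 node: rough count of hints currently queued locally.
    echo "SELECT count(*) FROM system.hints;" | cqlsh

    # We believe nodetool in 2.0.x can also drop stale hints if they are no
    # longer useful (e.g. because DC1 will be repaired anyway):
    nodetool truncatehints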
After DC1 is back up:
4) During the outage, replication to the DC1 nodes will fail. How can we efficiently bring DC1 back to up-to-date data using DC2?
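What we have in mind so far for 4) is a per-node repair, but we are not sure it is the most efficient option after a multi-day gap; the keyspace name is again a placeholder:

    # On each DC1 node after it is back up: repair its primary ranges against
    # the live replicas in DC2. "myks" is a placeholder keyspace name.
    nodetool repair -pr myks

    # If a DC1 node's data directories had to be wiped during the maintenance,
    # streaming everything back from DC2 might be simpler:
    nodetool rebuild DC2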