6

We have a Cassandra cluster with a single token per node, 22 nodes in total, and an average load of 500 GB per node. The main keyspace uses SimpleStrategy, and the cluster uses SimpleSnitch.

We need to migrate all data to a new datacenter and shut down the old one without downtime. The new cluster has 28 nodes, and I want to use vnodes on it.

I'm thinking of the following process:

  1. Migrate the old cluster to vnodes
  2. Setup the new cluster with vnodes
  3. Add nodes from the new cluster to the old one and wait until it balances everything
  4. Switch clients to the new cluster
  5. Decommission nodes from the old cluster one by one

But there are a lot of technical details. First of all, should I shuffle the old cluster after the vnodes migration? Then, what is the best way to switch to NetworkTopologyStrategy and GossipingPropertyFileSnitch? I want to switch to NetworkTopologyStrategy because the new cluster has 2 different racks with separate power/network switches.

relgames

2 Answers

5

should I shuffle the old cluster after vnodes migration?

You don't need to. If you go from one token per node to 256 (the default), each node will split its range into 256 adjacent, equally sized ranges. This doesn't affect where data lives, but it means that when you bootstrap a new node in the new DC, the cluster will remain balanced throughout the process.
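For example, on Cassandra 1.2 the vnode migration itself is just a cassandra.yaml change on each existing node followed by a restart (a minimal sketch, assuming the default token count of 256):

    # cassandra.yaml on each existing (old-cluster) node, then restart it;
    # on restart the node splits its single range into 256 contiguous sub-ranges
    num_tokens: 256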

what is the best way to switch to NetworkTopologyStrategy and to GossipingPropertyFileSnitch?

The difficulty is that switching the replication strategy is in general not safe, since data would need to be moved around the cluster. NetworkTopologyStrategy (NTS) will place data on different nodes if you tell it nodes are in different racks. For this reason, you should move to NTS before adding the new nodes.

Here is a method to do this, after you have upgraded the old cluster to vnodes (your step 1 above):

1a. List all existing nodes as being in DC0 in the properties file. List the new nodes as being in DC1 and their correct racks.

1b. Change the replication strategy to NTS with options DC0:3 (or whatever your current replication factor is) and DC1:0.
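As a sketch of steps 1a and 1b (the keyspace name main_ks and the rack names are placeholders; with GossipingPropertyFileSnitch each node carries its own entry in cassandra-rackdc.properties, whereas PropertyFileSnitch would list every node in cassandra-topology.properties):

    # cassandra-rackdc.properties on each existing (DC0) node
    dc=DC0
    rack=RAC1

    # cassandra-rackdc.properties on each new (DC1) node, matching its physical rack
    dc=DC1
    rack=RAC1

and then, in cqlsh (if your version rejects an explicit replication factor of 0, omitting DC1 here has the same effect):

    ALTER KEYSPACE main_ks
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC0': 3, 'DC1': 0};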

Then, to add the new nodes, follow the process here: http://www.datastax.com/docs/1.2/operations/add_replace_nodes#adding-a-data-center-to-a-cluster. Remember to set the number of tokens to 256, since it will be 1 by default.
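The cassandra.yaml settings on each new node would look something like this (the cluster name and seed IPs are placeholders; whether to disable auto_bootstrap and run nodetool rebuild afterwards should follow the linked procedure):

    # cassandra.yaml on each new (DC1) node
    cluster_name: 'MyCluster'              # must match the existing cluster's name
    num_tokens: 256                        # vnodes; otherwise the node gets a single token
    endpoint_snitch: GossipingPropertyFileSnitch
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.0.0.1,10.0.0.2" # include seed nodes from DC0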

In step 5, you should set the replication factor for DC0 to 0, i.e. change the replication options to DC0:0, DC1:3. At that point those nodes aren't holding any replicas, so decommission won't stream any data, but you should still decommission them rather than just powering them off, so that they are removed from the ring.
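A sketch of step 5 (main_ks is a placeholder again; if your version rejects a replication factor of 0, removing DC0 from the map has the same effect):

    ALTER KEYSPACE main_ks
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC0': 0, 'DC1': 3};

followed by, on each old node in turn:

    nodetool decommission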

Note that one risk is that writes made at a low consistency level to the old nodes could get lost. To guard against this, you could write at CL.LOCAL_QUORUM after you switch to the new DC. There is still a small window where writes could get lost (between steps 3 and 4). If it is important, you can run repair before decommissioning the old nodes to guarantee no losses, or write at a higher consistency level.
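If you do want the repair, a minimal sketch is to run the following on each old (DC0) node before decommissioning it (main_ks is a placeholder keyspace name):

    nodetool repair main_ks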

Richard
  • _writes made at a low consistency level to the old nodes could get lost_ Can you explain why? Don't know if it matters, but we don't overwrite old values; we always add new data. This is how I see it: 1. DC0 and DC1 are up, clients are using DC0, data flow is DC0 -> DC1. 2. Switch: clients are using DC1, data flow is DC0 -> DC1 and DC1 -> DC0. 3. Old data is transferred to DC1, flow is just DC1 -> DC0. 4. Set DC0:0, no data transfers between DC0 and DC1. Indeed I see that some reads from DC1 at step 2 may not see data that is still in DC0 only, but we can live with that. But it's not lost. – relgames Mar 13 '13 at 14:13
  • During step 3, even though your client is connected to a node in DC1, it is not guaranteed to write a value to a node in DC1 at low consistency levels. E.g. if you write at CL.ONE, all replicas in DC1 may fail (e.g. due to being too busy so they drop the write) and the write only ends up on a node in DC0. When you set DC0:0 this write is lost even though it was acknowledged to the client. – Richard Mar 13 '13 at 15:04
  • Any idea why datastax is saying don't do this anymore? http://datastax.com/documentation/cassandra/2.0/cassandra/configuration/configVnodesProduction_t.html - also here http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes-2 – chrislovecnm Dec 02 '14 at 20:23
-1

If you are trying to migrate to a new cluster with vnodes, wouldn't you need to change the Partitioner? The documentation says that it isn't a good idea to migrate data between different Partitioners.