I have Cassandra cluster with three nodes. We have data close to 7 TB from last 4 years. Now because of less space available in the server, we would like to keep data only for last 2 years. But we don't want to delete it completely(data older than 2 years). We want to keep specific data even it is older than 2 years. Currently I can think of one approach: 1) Java client using "MutationBatch object". I can get all the records key which fall into date range and excluding rows which we don't want to delete. Then deleting records in a batch. But this solution raises concern over performance as data is huge.
Is it possible to handle it at the server level(opscenter). I read about TTL but how can I apply it to an existing data and also restrict some of the data which I want to keep even if it is older than 2 years.
Please help me in finding out the best solution.