I have Cassandra cluster with three nodes. I want to perform clean up task on particular table which consist around 1 TB of data in each node.
Table consist multiple rows of subscriber with date_created(timestamp), I want to clear all rows of subscriber for which latest entry(date_created) is older than 7 Days.
For example - in sample attached above data we have to delete all rows subscriber 2 whereas all rows of subscriber 2 will be preserved if runs of 2019-12-10.
We have around 10 M of subscriber, In order to get total number of records select count(*) query is throwing below exception
ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
What approach I should use to read Cassandra, since this cant be perform in memory as huge data of 3 TB.