We have less than 50GB of data for a table and we are trying to come up with a reasonable design for our Cassandra database. With so little data we are thinking of having all data on each node (2 node cluster with replication factor of 2 to start with).
We want to use Cassandra for easy replication - safeguarding against failover, having copies of data in different parts of the world and Cassandra is brilliant for that.
Moreover, best model that we currently came up with would imply that a single query (consistency level 1-2) would involve getting data from multiple partitions (avg=2, 90th %=20). Most of the queries would ask for data from <= 2 partitions but some might go up to 5k.
So my question here is whether it is really a problem? Is Cassandra slow to retrieve data from multiple partitions if we ensure that all the partitions are on the single node?