
We are planning to launch a Cassandra cluster on AWS EC2. At a minimum, we plan to launch two nodes, each in a different availability zone (AZ) of the same region (us-east-1), for better fault tolerance. However, AWS charges $0.02 per GB for cross-AZ data transfer. This raises the question of how much data the nodes transfer per month just to communicate periodically (gossip), so that I can estimate the associated cost. I just don't want to be surprised when the bill arrives.

Suppose there are only two nodes, each in a different AZ, and no client reads or writes at all (I know how to estimate that part). How much data will they transfer per month for gossip? As the cluster grows, how will that traffic grow? Will it grow as O(N^2)?

Dichen
  • I'm still not sure about the answer to this question, but it turns out not to matter much: the cost of data transfer is far lower than the cost of keeping the instances running. – Dichen Jun 02 '16 at 21:16

1 Answer


You probably don't need to worry about the cost of gossip traffic. I don't know exactly how much data gossip transfers, but gossip messages are essentially just heartbeats, so expect them to be very small compared with the application data. You would typically run something like 3 nodes across 3 availability zones with a replication factor of 3, so that data is replicated to all 3 AZs and the cluster tolerates the failure of a single AZ. This means that when you insert 1 GB of data, at least 1 GB is transferred across AZs.

Plus, as you said, the EC2 and EBS costs will be much higher.
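To put rough numbers on this, here is a minimal back-of-the-envelope sketch in Python. It is not a measurement: the bytes per gossip exchange is a purely hypothetical placeholder (measure it on your own cluster, for example with a packet capture on the inter-node port), and the one-exchange-per-node-per-second rate is an assumption about the gossip schedule. The $0.02 per GB rate is the one quoted in the question, and the function names are made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumes a 30-day month
CROSS_AZ_USD_PER_GB = 0.02           # rate quoted in the question
BYTES_PER_GB = 10**9


def gossip_gb_per_month(nodes, bytes_per_exchange,
                        peers_per_round=1, rounds_per_second=1.0,
                        cross_az_fraction=1.0):
    """Estimate monthly cross-AZ gossip traffic in GB.

    bytes_per_exchange is a hypothetical figure supplied by the caller.
    cross_az_fraction is the share of exchanges that cross an AZ boundary
    (1.0 for two nodes in two different AZs).
    """
    exchanges = nodes * peers_per_round * rounds_per_second * SECONDS_PER_MONTH
    return exchanges * bytes_per_exchange * cross_az_fraction / BYTES_PER_GB


def cross_az_cost_usd(gigabytes):
    """Cost of transferring the given number of GB across AZs."""
    return gigabytes * CROSS_AZ_USD_PER_GB


if __name__ == "__main__":
    # Two nodes, assuming (hypothetically) 1000 bytes per gossip exchange.
    gb = gossip_gb_per_month(nodes=2, bytes_per_exchange=1000)
    print(f"gossip: {gb:.3f} GB/month, about ${cross_az_cost_usd(gb):.2f}")
    # For comparison: cross-AZ cost of 1 GB of replicated application data.
    print(f"1 GB of replication: ${cross_az_cost_usd(1.0):.2f}")
```

Under these assumptions the gossip cost is a matter of cents per month. If the per-exchange size also grows with the number of nodes, pass a larger `bytes_per_exchange` for larger clusters to see how the total scales.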

CloudStax