2

How is Cassandra's eventual consistency model different from HBase? It seems Facebook moved from Cassandra to HBase because consistency issues. Which of these NoSQL DBs are ideal for scale and performance with consistency as near as possible to 'immediate'. What is the factor by which performance degrades when we try to improve upon consistency?

vmr
  • 1,895
  • 13
  • 24

1 Answers1

2

Here's Facebook's original post on why they chose HBase for Messenger. At the time they decided HBase was "ideal for scale and performance with consistency as near as possible to 'immediate'", however they reached its limits and later developed a new service called Iris that handles the most recent week of messages, while storing the older messages in HBase.

Cassandra's consistency model provides a lot of flexibility. The biggest difference is that Cassandra is a shared-nothing architecture: each server is designed to be able to function independently, thus high availability and partition tolerance at the cost of consistency.

With HBase however there is a single source of truth, at the (apparent) cost of availability and partition tolerance. The read process, from the client's perspective, involves finding the location of that data and reading it from that server. Any updates to that data are atomic.

Here's one HBase vs Cassandra benchmark that shows HBase outperforming Cassandra on nearly every test on (mostly) default settings, and here's another benchmark that shows Cassandra outperforming HBase on certain tests. I think the conclusion here is that the answer to your question is highly dependent on your use case.

Here's a good article that sums up the plusses and minuses of each, and can help you decide which one is best for your needs.

Chris Leung
  • 143
  • 2
  • 10