We are currently facing the problem of how to effectively store and retrieve data from very large data sets (into the billions). We have been using MySQL and have optimized the system, OS, RAID, queries, indexes, etc., and are now looking to move on.
I need to make an informed decision about which technology to pursue to solve our data problems. I have been investigating MapReduce with HDFS, but have also heard good things about HBase. I can't help but think there are other options as well. Is there a good comparison of the available technologies and the trade-offs of each?
If you have links to share on each, I would appreciate that as well.