
It is generally believed that RDBMSs scale better vertically, while NoSQL databases are designed specifically to scale horizontally.

What kind of database would be a better fit for hardware like this, which can scale up to many hundreds of TBs? http://www.dell.com/us/enterprise/p/dell-compellent-storage-center/pd.aspx

NoSQL databases are typically designed to run on commodity servers, so what should the specs of such a commodity server be, given a very high load on the database?

And what would be the limit of a single RDBMS hardware node, assuming enough RAM is available to keep the indexes (not the dataset) in memory all the time?

NoSQL, by contrast, has no hard requirement to keep indexes in RAM (though of course it is recommended).

It would also be very interesting to know which databases are the best fit for SAN devices that can hold many hundreds of TBs in one place, such as http://www-03.ibm.com/systems/storage/disk/ds3500/index.html

Gary Lindahl

1 Answer


The decision between a NoSQL database and a relational database should come from your use case and application, not from your hardware. Analyze your use case first and determine whether an RDBMS or a NoSQL database fits it. (If NoSQL, which NoSQL database to use is another crucial question, and the answer again depends on your use case.) Then buy the hardware needed to run what you have chosen.

The machine you link to can run both an RDBMS and a NoSQL database, depending on what you need to do with your data. For MongoDB and HBase, "commodity servers" generally means 4-core machines with 16-24 GB of RAM. HBase starts showing good results only once you have at least 6-7 machines in your cluster. For other NoSQL databases the numbers might be different.
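As a rough illustration of how a cluster of commodity machines relates to a dataset of "many dozen TBs", here is a back-of-the-envelope sketch. Everything in it is an assumption made for illustration and not something stated above: the function name `nodes_needed`, the 36 TB dataset, the 8 TB of disk per node, the 70% fill ratio, and a replication factor of 3 (the HDFS default, which an HBase cluster would typically inherit). It only estimates storage; CPU, RAM, and write throughput need separate sizing.

```python
import math


def nodes_needed(dataset_tb, replication_factor, disk_per_node_tb, fill_ratio=0.7):
    """Estimate the node count needed just to store a replicated dataset.

    All parameters are hypothetical planning inputs:
    - dataset_tb: logical size of the data before replication
    - replication_factor: number of copies the cluster keeps of each block
    - disk_per_node_tb: raw disk capacity of one commodity server
    - fill_ratio: fraction of each disk you are willing to fill
    """
    raw_tb = dataset_tb * replication_factor
    usable_per_node_tb = disk_per_node_tb * fill_ratio
    return math.ceil(raw_tb / usable_per_node_tb)


if __name__ == "__main__":
    # Assumed example: 36 TB of data, 3 copies, 8 TB of disk per server.
    print(nodes_needed(dataset_tb=36, replication_factor=3, disk_per_node_tb=8))
```

With inputs in this range the storage requirement alone already pushes the cluster well past the 6-7 machine minimum mentioned above, which is one reason the use case, rather than a single large storage box, should drive the hardware choice.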

Hari Menon
  • When moving from an RDBMS to NoSQL, one has to sacrifice the simplicity of SQL and joins. My workload is 80% writes and 20% reads, but the database size can reach many dozens of TBs, so we are considering NoSQL purely because of the scalability issue; otherwise an RDBMS (MySQL) is the best fit. – Gary Lindahl Sep 06 '11 at 19:29
  • What kind of reads are they? Do you need joins in your use case? – Hari Menon Sep 09 '11 at 08:08
  • I can sacrifice joins for about 50 extra bytes of storage per row. At the moment a join is required between two small tables, holding only about 3% and 12% of the total data, and a larger table holding all the rest. – Gary Lindahl Sep 09 '11 at 15:30
  • Do you need to do the joins in real time? Are there ad-hoc queries, or are the queries known beforehand so that they can be pre-calculated? Can you store the larger table in plain text files, use Hadoop to process it, and dump the processed data into a table somewhere? – Hari Menon Sep 09 '11 at 18:41
  • Well, if we go with joins then they will not be real-time, so they can be scheduled. The queries are known beforehand; only a couple of parameters would change for each record, depending on what it matches in the joined tables. – Gary Lindahl Sep 09 '11 at 19:17
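The trade-off raised in the comments (roughly 50 extra bytes per row in exchange for dropping the join) can be put into numbers with a small sketch. The 50 bytes come from the comment above; the function name `denormalization_overhead_gb` and the row count of 10 billion are hypothetical, since the thread never states how many rows the large table holds.

```python
def denormalization_overhead_gb(row_count, extra_bytes_per_row=50):
    """Extra storage, in decimal GB, from copying joined columns into every row.

    extra_bytes_per_row defaults to the ~50 bytes mentioned in the comments;
    row_count is an assumed value, not a figure from the discussion.
    """
    return row_count * extra_bytes_per_row / 1e9


if __name__ == "__main__":
    # Assumed example: a 10-billion-row table.
    print(denormalization_overhead_gb(10_000_000_000))
```

Against a dataset measured in dozens of TBs, an overhead of this order is small, which is consistent with the asker's willingness to give up joins. Note that any replication the chosen database applies would multiply this figure as well.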