
My question is about how values should be stored in NoSQL databases.

Should the stored value be completely ready for the application to use directly, or is it also OK to store it in a form that needs some extra processing before the application can use it?

A quick example: if we want the average value for a whole day, does it make more sense to always keep the average itself stored, or would it be better to have one key per individual value and let the application calculate the average?

The first approach is faster to read, but it is restricted to the average for the whole day. The second approach is slower (the average has to be calculated on every read), but it would also let us calculate averages per hour.
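To make the second approach concrete, here is a minimal sketch. It assumes a plain dict standing in for the key-value store and a made-up key layout (`reading:<date>:<hour>:<minute>`); neither comes from a real product, and a real store may not offer prefix scans at all.

```python
# A plain dict stands in for the key-value store; the key layout is hypothetical.
store = {
    "reading:2012-02-08:09:00": 10.0,
    "reading:2012-02-08:09:30": 14.0,
    "reading:2012-02-08:15:00": 30.0,
}

def average_for_prefix(store, prefix):
    """Average every value whose key starts with `prefix` (the scan is done by the app)."""
    values = [v for k, v in store.items() if k.startswith(prefix)]
    if not values:
        return None  # no readings for that period
    return sum(values) / len(values)

daily = average_for_prefix(store, "reading:2012-02-08")      # whole day
hourly = average_for_prefix(store, "reading:2012-02-08:09")  # readings from hour 09 only
```

The same raw keys answer both the daily and the hourly question, at the cost of reading every value each time.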

For me this is a question of philosophy: should the data in a NoSQL database be completely ready to use, or does it make sense to store it in a form that needs some extra processing?

Thanks a lot :)

Juan Antonio Gomez Moriano
  • Also, one consideration: if we store a precalculated average, we would be reading from and writing to the NoSQL store all the time. Since NoSQL stores usually provide no transactions, the data could become inconsistent if two clients fetch the current average and then each try to calculate and store the new one. – Juan Antonio Gomez Moriano Feb 08 '12 at 01:06
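The lost update described in this comment can be reproduced deterministically. This is only a sketch: a dict plays the role of a store without transactions, and the record shape (`sum`, `count`) and helper names are assumptions made for illustration.

```python
# A dict stands in for a store with no transactions; all names are hypothetical.
KEY = "avg:2012-02-08"
store = {KEY: {"sum": 20.0, "count": 2}}

def read(store, key):
    # Return a copy, as a remote client would receive its own copy of the record.
    return dict(store[key])

def with_new_value(record, value):
    # Compute the updated record on the client side.
    return {"sum": record["sum"] + value, "count": record["count"] + 1}

# Both writers read the same state before either one writes back.
seen_by_a = read(store, KEY)
seen_by_b = read(store, KEY)

store[KEY] = with_new_value(seen_by_a, 30.0)  # writer A stores sum=50, count=3
store[KEY] = with_new_value(seen_by_b, 40.0)  # writer B overwrites with sum=60, count=3

record = store[KEY]
stored_average = record["sum"] / record["count"]  # A's value 30.0 was lost
correct_average = (20.0 + 30.0 + 40.0) / 4        # what it should have been
```

Some stores offer atomic counters or compare-and-set operations that avoid this kind of overwrite; whether a given store does is something to check in its documentation.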

2 Answers


Your considerations are not specific to NoSQL; they are application-level design decisions.

That said, some NoSQL databases handle aggregations better than others, performance-wise; Cassandra is one example. Look for Hadoop+Cassandra solutions, which leverage MapReduce to create aggregates.
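For readers unfamiliar with the MapReduce shape, the sketch below computes hourly averages with a map step and a reduce step in plain Python. It is not the Hadoop or Cassandra API; the function names and the `(hour, value)` input format are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical input: (hour, value) pairs as a job might read them from the store.
readings = [(9, 10.0), (9, 14.0), (15, 30.0)]

def map_phase(records):
    """Emit (key, (partial_sum, partial_count)) for every record."""
    for hour, value in records:
        yield hour, (value, 1)

def reduce_phase(pairs):
    """Combine partial sums and counts per key, then derive each average."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, (partial_sum, partial_count) in pairs:
        totals[key][0] += partial_sum
        totals[key][1] += partial_count
    return {key: s / c for key, (s, c) in totals.items()}

hourly_averages = reduce_phase(map_phase(readings))
```

Emitting sums and counts rather than averages is what lets partial results from different workers be combined safely.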

Also see this similar question & answer: NoSQL databases - good candidates for log processing/aggregation and rollup?

Ofer Zelig

In my understanding, the NoSQL philosophy is to store precalculated values that are ready to use.
We can accept values that are a bit stale, for example by recalculating the daily average every hour.
I view NoSQL as an RDBMS that gave up joins and scans and swore to always access data by primary key. In return it was granted scalability. It became simple by moving part of the complexity to the application layer, so I find it logical that the burden of maintaining averages falls on the application.
We can also look at the same question from a different perspective. Let's assume we had good group-by capability on the NoSQL server side (which we do not). It would then be hard to ensure good quality of service for the "main" case of access by key. Even for a mature RDBMS, it is not easy to ensure good performance under a mixed OLAP-OLTP load.
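As a rough sketch of what maintaining the average in the application could look like, the code below has a periodic job write one precalculated record while readers do a single lookup by key. The dict stands in for the store, and the key format, field names, and function names are all hypothetical.

```python
# A dict stands in for the key-value store; keys and field names are hypothetical.
store = {}

def refresh_daily_average(store, day, readings, now_hour):
    """Recompute the day's average in the application and store it under one key."""
    if not readings:
        return  # nothing to average yet
    store[f"avg:{day}"] = {
        "average": sum(readings) / len(readings),
        "computed_at_hour": now_hour,
    }

def get_daily_average(store, day):
    """Readers do a single lookup by key; the value may be up to an hour old."""
    return store.get(f"avg:{day}")

# The application would call this from an hourly job.
refresh_daily_average(store, "2012-02-08", [10.0, 14.0], now_hour=10)
snapshot = get_daily_average(store, "2012-02-08")
```

Keeping the computation time next to the value lets readers decide whether the snapshot is fresh enough for their purpose.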

David Gruzman