I'm continuously receiving a large volume of data that gets stored in a processedData collection. Each document looks like this:

{
    date: "2011-12-4",
    time: 2243,
    gender: {
        males: 1231,
        females: 322
    },
    age: 32
}

Documents like this keep arriving all the time. I want to be able to see all "males" that are above 40 years old. Because of the sheer size of the data, this does not seem to be an efficient query.

Any tips?

Shamoon
  • Can you give a bit more context, such as how that data gets into MongoDB? Also, do you have a fixed set of queries you need (i.e. "all males above 40 years old"), or do you want a system that can run ad hoc queries like that one? – matehat Nov 20 '11 at 20:19

2 Answers

Generally speaking, you can't.

However, there may be some shortcuts, depending on your actual requirements. Do you want to count 'males above 40' across the whole dataset, or just for one day?

1 day: split your data into daily collections (processedData-20111121, ...), so that each query only has to scan one day's documents. You can also cache the results of such queries.
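
As a rough illustration of the daily-collection idea, here is a minimal sketch of how a document's date could be mapped to a collection name. The helper name `dailyCollectionName` is hypothetical, and it assumes dates always arrive in the "2011-12-4" form shown in the question (year-month-day, without zero padding).

```javascript
// Hypothetical helper: map a document's date string to a per-day collection name.
// Assumes dates look like "2011-12-4" (year-month-day, month and day not zero-padded).
function dailyCollectionName(dateStr, prefix = "processedData") {
  const [year, month, day] = dateStr.split("-").map(Number);
  if (![year, month, day].every(Number.isInteger)) {
    throw new Error(`Unrecognized date: ${dateStr}`);
  }
  // Zero-pad month and day so names sort chronologically.
  const pad = (n) => String(n).padStart(2, "0");
  return `${prefix}-${year}${pad(month)}${pad(day)}`;
}

console.log(dailyCollectionName("2011-12-4")); // processedData-20111204
```

In the mongo shell, the resulting name could then be passed to `db.getCollection(...)` when inserting or querying.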

whole dataset: pre-aggregate the data. That is, whenever a new data entry is inserted, also do something like this:

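// The third argument (true) makes this an upsert:
// the counter document is created the first time it is needed.
db.preaggregated.update({_id : 'male_40'},
     {$set : {gender : 'm', age : 40}, $inc : {count : 1231}},
     true);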

Similarly, if you know all your queries beforehand, you can just precalculate them (and not keep raw data).
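
To make the pre-aggregation step more concrete, here is a small sketch of how the insert path might decide which counter to bump. The function name `preaggregateOps` is hypothetical, the bucket id 'male_40' follows the example above, and the "age > 40" rule is an assumption taken from the question.

```javascript
// Hypothetical sketch: work out which pre-aggregated counters an incoming
// document should increment. Returns a list of { filter, update } pairs.
function preaggregateOps(doc) {
  // Assumption: only documents with age above 40 feed the 'male_40' bucket.
  if (doc.age <= 40) return [];
  return [
    {
      filter: { _id: "male_40" },
      update: {
        $set: { gender: "m", age: 40 },
        $inc: { count: doc.gender.males },
      },
    },
  ];
}

const ops = preaggregateOps({
  date: "2011-12-4",
  time: 2243,
  gender: { males: 1231, females: 322 },
  age: 45,
});

// In the mongo shell, each op could then be applied as an upsert:
//   db.preaggregated.update(op.filter, op.update, true);
console.log(JSON.stringify(ops));
```

Keeping the bucketing logic in one function like this makes it easier to add more precalculated queries later without touching the code that talks to the database.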

It also depends on how you define "real-time" and how big a query load you will have. In some cases it is ok to just fire ad-hoc map-reduces.
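
For the ad-hoc route, a map-reduce for this question might look roughly like the sketch below. The function names and the key "males_over_40" are made up for illustration, and `runLocally` is only an in-memory stand-in so the data flow can be tried outside the database.

```javascript
// map: runs once per document, with the document bound to `this`.
function mapMalesOver40() {
  if (this.age > 40) emit("males_over_40", this.gender.males);
}

// reduce: sums the partial counts emitted for one key.
function reduceSum(key, values) {
  return values.reduce((a, b) => a + b, 0);
}

// Tiny in-memory stand-in for the server, only to show how map and reduce fit together.
function runLocally(docs, map, reduce) {
  const groups = new Map();
  globalThis.emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc));
  delete globalThis.emit;
  const out = {};
  for (const [key, values] of groups) out[key] = reduce(key, values);
  return out;
}

const sampleDocs = [
  { gender: { males: 1231, females: 322 }, age: 32 },
  { gender: { males: 10, females: 4 }, age: 45 },
  { gender: { males: 5, females: 9 }, age: 50 },
];
console.log(runLocally(sampleDocs, mapMalesOver40, reduceSum)); // { males_over_40: 15 }

// In the mongo shell, the same two functions could be passed to:
//   db.processedData.mapReduce(mapMalesOver40, reduceSum, { out: { inline: 1 } });
```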

Alnitak
Sergio Tulentsev
My guess is that your target GUI is a website? In that case, you are looking for something called Comet. You should build a layer that processes all the data and broadcasts new mutations to your clients or to an event bus (more on that below). Mongo doesn't provide real-time data by itself, as it doesn't emit anything on a mutation, so you can use any data store that suits you.

Depending on the language you'll use you have different options (for comet):

  • Socket.IO (Node.js) - JavaScript
  • Cometd - Java
  • SignalR - C#
  • Libwebsocket - C++

Most of the time you'll need an event bus or message queue to put the mutation events on. Take a look at JMS, Redis or NServiceBus (depending on what you'll use).
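
To give an idea of the shape of such a layer, here is a minimal in-process sketch using Node's built-in EventEmitter as a stand-in for a real event bus. The event name "processedData:insert" and the function `recordAndPublish` are hypothetical, and the array stands in for the actual data store.

```javascript
const { EventEmitter } = require("events");

// In-process stand-in for an event bus; a real setup would sit on a message queue.
const bus = new EventEmitter();

// Hypothetical write path: store the document, then announce the mutation.
function recordAndPublish(store, doc) {
  store.push(doc); // stand-in for the database insert
  bus.emit("processedData:insert", doc);
}

// Hypothetical subscriber: keeps a live total that a Comet layer could push to clients.
let malesOver40 = 0;
bus.on("processedData:insert", (doc) => {
  if (doc.age > 40) malesOver40 += doc.gender.males;
});

const store = [];
recordAndPublish(store, { gender: { males: 10, females: 4 }, age: 45 });
recordAndPublish(store, { gender: { males: 1231, females: 322 }, age: 32 });
console.log(malesOver40); // 10
```

The point of the indirection is that the code writing to the data store does not need to know who is listening; the Comet layer simply subscribes to the bus.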

mark_dj