0

Below are sample records that I would normally insert into MySQL. I can then do the regular types of queries usig SQL. Note that I will have datetime in 5 minute intervals.

datetime          account_id country  zip   count 
2012-04-27 03:40   1234        69    91845   234
2012-04-27 03:45   3432        43    91813   212

I will be using simpledb with the python boto api.

Given that its a key value data store where the values can be stored as dictionary/json type objects, what is the proper structure to store the data so I can query? E.g. select sum(count) group by country.

Tampa
  • 75,446
  • 119
  • 278
  • 425
  • 1
    I am not getting your requirement. I think you would like to upload data from MySql to Amazon SimpleDB and query your data on Amazon SimpleDB. If so then you can upload your data directly from MySql to Amazon SimpleDB without JSON. You can get the value for each attribute from result set againt the query whose result you woould like to upload. – Ashish Pancholi Apr 27 '12 at 05:19

1 Answers1

0

SimpleDB only really supports count(*) aggregation, not sum.

You would either 1) need to do some hadoop processing to aggregate the results and return a result or 2) store and increment your aggregates in a separate document (I generally either add the logic close to my repository (like in an update method) or, for documents that require faster update/get routines, add a message to Amazon SQS and then recalculate those aggregates in a background service.

Truth told - I don't report much from SimpleDB, it is far easier to write a sync script that updates the data regularly in a relational database, then I can report from that, without worrying about resource contention with the front end application.

Thanks,

Hal

Hal
  • 1,229
  • 11
  • 26