I am developing an application for a sales force. I cannot figure out how to manage big data in my application. The scenario is as follows.

Locations are divided according to the following hierarchy.

Country => State => City => Territory => Area => Outlet.

My table structure for managing daily sales is roughly as follows.

Outlet ID - 1,2,3,4,5,6 ...

User ID - EMP001,EMP002,EMP003,EMP004,EMP005,EMP006 ...

Product ID - 78,54,21,11,09,83 ..

Quantity - 12,34,67,43,70,03 ..

Date & Time - 01/05/2014 – 11.00,01/05/2014 – 12.00,01/05/2014 – 14.00 ..

and other fields. Based on the above data structure, there will be many reports that are viewed in real time.

We insert about 1 million rows per day. I have settled on Cassandra as the NoSQL database.

Now I need a tool that can query this data and manage real-time analytics. I have heard and read about open-source tools such as HBase, Pig, Hive, Presto, Impala, Spark, Shark, etc.

Currently I am not able to judge which is the best fit for my application for real-time analytics and forecasting product sales.

Your help and guidance will be highly appreciated.

Thanks

ChrisGPT was on strike

1 Answer

Presto + Cassandra is a good fit for you. Cassandra + Shark works as well.
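
To give a feel for the kind of report query a SQL engine such as Presto would run over the sales data, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a local stand-in so the query can be run anywhere; it does not talk to Presto or Cassandra. The table name `daily_sales`, the column names, the helper `units_per_outlet_and_product`, and the sample rows are all assumptions made up for illustration, loosely following the fields in the question.

```python
import sqlite3

# SQLite is only a local stand-in so the query shape can be tried out;
# it is NOT Presto or Cassandra. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_sales (
        outlet_id  INTEGER,
        user_id    TEXT,
        product_id INTEGER,
        quantity   INTEGER,
        sold_at    TEXT      -- ISO 8601 timestamp, e.g. 2014-05-01T11:00
    )
    """
)

# Made-up sample rows modelled on the fields listed in the question.
rows = [
    (1, "EMP001", 78, 12, "2014-05-01T11:00"),
    (2, "EMP002", 54, 34, "2014-05-01T12:00"),
    (1, "EMP003", 78, 67, "2014-05-01T14:00"),
]
conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?, ?, ?)", rows)


def units_per_outlet_and_product(conn, day):
    """Return (outlet_id, product_id, total units) for one day (YYYY-MM-DD)."""
    query = """
        SELECT outlet_id, product_id, SUM(quantity) AS units
        FROM daily_sales
        WHERE substr(sold_at, 1, 10) = ?
        GROUP BY outlet_id, product_id
        ORDER BY outlet_id, product_id
    """
    return conn.execute(query, (day,)).fetchall()


report = units_per_outlet_and_product(conn, "2014-05-01")
for outlet_id, product_id, units in report:
    print(outlet_id, product_id, units)
```

Rolling the same totals up to Area, Territory, or City would presumably mean joining against a table that maps each outlet to its place in the location hierarchy; that mapping is not shown in the question, so it is left out here.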

alexliu68