How exactly to process data on Hadoop, Hive, Pig

Question

I have learned the basics of Apache Hadoop and Hive and know most of the commands. Now, how exactly do I work with real data? I have a huge amount of data available (I got it from someone), but I don't know what exactly to do with it.

The data (.xlsx) is the weekly and quarterly sales of a huge company (I can't name it). It is laid out column-wise: sales of different products in different branches in the US.

What processing can be done on this? Should I filter the data before doing that?

People usually know what is to be done and search for tools and techniques to do the same. Here it's the reverse. – Kedar Parikh, May 28 '15 at 05:36
Haha ;-) :-). Actually, I am new to this tool. I learned the basic commands using a small set of data. Now I want to apply them to a bigger set. Can you please tell me at least what is normally done? Any hint will be fine. – Sanjeev, May 28 '15 at 05:41

score 0 · Answer 1 · answered May 28 '15 at 05:48


You can try to find some interesting insights in the data like:
1) Best-selling products
2) Products that are often bought together
3) Branches with the highest sales
4) Optimum inventory levels:
a. When inventory drops to zero – potential sales are lost
b. Unused inventory, with no sales
5) Time of the year when certain products are more in demand
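
As a rough illustration of items 1 and 3, here is a minimal sketch in Python using an in-memory SQLite table. The table name `sales`, its columns, and the sample rows are all hypothetical, since the real layout of the spreadsheet is not known. The aggregations are plain `GROUP BY` / `SUM` / `ORDER BY` queries, so something similar could likely be written in HiveQL once the data is exported from .xlsx to a delimited file and loaded into a Hive table.

```python
import sqlite3

# Hypothetical sample rows: (week, branch, product, units_sold, revenue).
# The real spreadsheet columns are unknown; these are made up for illustration.
ROWS = [
    ("2015-W01", "Austin", "widget", 10, 100.0),
    ("2015-W01", "Boston", "widget", 4, 40.0),
    ("2015-W01", "Austin", "gadget", 2, 90.0),
    ("2015-W02", "Boston", "gadget", 5, 225.0),
    ("2015-W02", "Austin", "widget", 7, 70.0),
]


def load(rows):
    """Create an in-memory table named `sales` (assumed schema) and fill it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "week TEXT, branch TEXT, product TEXT, "
        "units_sold INTEGER, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def top_products(conn, limit=3):
    """Item 1: products ranked by total units sold."""
    return conn.execute(
        "SELECT product, SUM(units_sold) AS total_units "
        "FROM sales GROUP BY product "
        "ORDER BY total_units DESC LIMIT ?",
        (limit,),
    ).fetchall()


def top_branches(conn, limit=3):
    """Item 3: branches ranked by total revenue."""
    return conn.execute(
        "SELECT branch, SUM(revenue) AS total_revenue "
        "FROM sales GROUP BY branch "
        "ORDER BY total_revenue DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = load(ROWS)
    print(top_products(conn))
    print(top_branches(conn))
```

Grouping by `week` instead of `product` or `branch` would give a first look at item 5 (seasonal demand) in the same way.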

Sounds interesting? This is just the start.

answered May 28 '15 at 05:48

Kedar Parikh


http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/ http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/ – Kedar Parikh May 28 '15 at 05:52
That sounds really interesting. I will definitely try them one by one. Thanks a lot, Kedar. I will surely get back to you. – Sanjeev May 28 '15 at 06:03
