I am a bit confused about Hadoop Hive, which, according to what I read on the wiki, is used for OLAP. Now I want to build OLAP on Hive from an OLTP database that uses MySQL.

How can I solve this? Can I use Kettle to build OLAP in Hive? Any guidance on how to build OLAP on Hive from an OLTP MySQL database?

Thanks.

troya_adromeda

1 Answer

I would suggest the following approach:
a) Identify the historical part of your OLTP data. Usually it is some kind of log of operations. Let's call it the fact table.
b) Partition the fact table by time.
c) Periodically offload the oldest partition from MySQL by exporting it to CSV and deleting it from MySQL.
d) Load this CSV file into Hive.

With this scheme, all but the latest data ends up in Hive, and the MySQL OLTP database is kept from growing.
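
As a rough illustration of steps c) and d), here is a minimal Python sketch that only builds the statements for one offload cycle; it does not connect to MySQL or Hive. The table name `fact_log`, the MySQL partition name `p201107`, the Hive partition column `dt`, and the directory `/tmp/offload` are all made-up names for the example. It also assumes that the Hive table is a text table with comma-separated fields, that the MySQL version supports `SELECT ... PARTITION (...)` (5.6 or later), and that the exported file is reachable from the machine where the Hive client runs.

```python
from dataclasses import dataclass


@dataclass
class OffloadPlan:
    """Statements for moving one time partition from MySQL to Hive."""
    export_sql: str  # run against MySQL: dump the partition to a CSV file
    drop_sql: str    # run against MySQL: remove the exported partition
    hive_load: str   # run in Hive: load the CSV into the matching partition


def plan_offload(table: str, partition: str, hive_partition_value: str,
                 csv_dir: str = "/tmp/offload") -> OffloadPlan:
    # Hypothetical file layout: one CSV per exported partition.
    csv_path = f"{csv_dir}/{table}_{partition}.csv"

    # INTO OUTFILE writes the file on the MySQL server host.
    export_sql = (
        f"SELECT * FROM {table} PARTITION ({partition}) "
        f"INTO OUTFILE '{csv_path}' "
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'"
    )

    # Dropping a partition removes its rows without a slow DELETE.
    drop_sql = f"ALTER TABLE {table} DROP PARTITION {partition}"

    # LOCAL means the path is on the Hive client's filesystem, not in HDFS.
    # 'dt' is an assumed partition column name on the Hive side.
    hive_load = (
        f"LOAD DATA LOCAL INPATH '{csv_path}' "
        f"INTO TABLE {table} PARTITION (dt='{hive_partition_value}')"
    )
    return OffloadPlan(export_sql, drop_sql, hive_load)


if __name__ == "__main__":
    plan = plan_offload("fact_log", "p201107", "2011-07")
    print(plan.export_sql)
    print(plan.drop_sql)
    print(plan.hive_load)
```

In practice it is probably safer to run the drop statement only after the export file has been verified (or after the Hive load has succeeded), so that a failed cycle does not lose data.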

David Gruzman
  • Thanks for the reply, David. I have used Sqoop to import the database from an RDBMS such as MySQL into Hive, but now I have a problem creating reports in Pentaho with Hive as the data source :( I have posted my problem at [http://stackoverflow.com/questions/7020565/create-datasource-hive-on-pentaho-hadoop](http://stackoverflow.com/questions/7020565/create-datasource-hive-on-pentaho-hadoop) but I haven't gotten an answer yet. – troya_adromeda Aug 15 '11 at 06:15