0

Please let me know which tool is preferable for validating data during a migration from an RDBMS to Hadoop HDFS.

My requirement is to validate the data being migrated from Oracle to Hadoop HDFS. The output is a flat file stored in HDFS.

Marcel Gwerder
dileepvarma

1 Answer

0

Is this a one-time migration, or should it run every day and keep the data in sync?

Raja
  • Hi Raja, it runs every day. – dileepvarma Nov 06 '13 at 16:14
  • You can try writing your own MapReduce job using Oracle's Big Data Connectors. You'll have better control over the data validation logic with this approach. Or you can use tools like Sqoop, Hive, Pig, etc. More information: [move-data-from-oracle-to-hdfs](http://stackoverflow.com/questions/16890053/move-data-from-oracle-to-hdfs-process-and-move-to-teradata-from-hdfs) – Raja Nov 07 '13 at 05:51
  • I have to validate the data only after the migration is done, not migrate the data from Oracle to HDFS. I have some basic knowledge of Pig, Hive, and Sqoop. My requirement is to compare the data from source to target; I'm not sure which of these is best suited. Any help with this? – dileepvarma Nov 08 '13 at 08:35
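For a source-to-target comparison like this, one lightweight approach is to fingerprint both sides (row count plus an order-independent checksum of the records) and compare the fingerprints. Below is a minimal Python sketch of that idea, not a specific tool's API: one side would be fed rows fetched from Oracle (e.g. via a query result), the other the lines of the HDFS flat file (e.g. piped from `hdfs dfs -cat`). The function names and the comma delimiter are illustrative assumptions.

```python
import hashlib

def dataset_fingerprint(lines, delimiter=","):
    """Return (row_count, checksum) for an iterable of delimited records.

    The checksum XORs per-record MD5 digests, so it is independent of
    record order -- useful because HDFS output files rarely preserve
    the ordering of the source table.
    """
    count = 0
    checksum = 0
    for line in lines:
        # Normalize field whitespace so formatting differences between the
        # Oracle export and the flat file don't cause false mismatches.
        record = delimiter.join(f.strip() for f in line.rstrip("\n").split(delimiter))
        digest = hashlib.md5(record.encode("utf-8")).hexdigest()
        checksum ^= int(digest, 16)
        count += 1
    return count, checksum

def validate(source_lines, target_lines):
    """True when both sides have the same row count and record checksum."""
    return dataset_fingerprint(source_lines) == dataset_fingerprint(target_lines)
```

For a recurring daily job, the same logic can be pushed down to the cluster side as a Hive query or MapReduce job over the flat files, with only the Oracle-side fingerprint computed in the database. Sqoop also ships a `--validate` option that performs a basic row-count comparison after an import/export, which may be enough if only counts need to match.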