How can I use Sqoop's incremental lastmodified mode if the source table has no timestamp or other date column, and the client does not allow any changes to the source table? Please guide!
1 Answer
1. Your source table is never updated, only appended to
Fetch the newly inserted rows using the auto-increment primary key, specifying the last key value you imported previously. This is what Sqoop's incremental append mode does.
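As a rough sketch of the append case, the snippet below assembles a sqoop import command that only pulls rows whose key is greater than the last value already imported. The flags (--incremental append, --check-column, --last-value) are standard Sqoop options; the JDBC URL, table name, column name, last value, and target directory are hypothetical placeholders, and credentials are left out.

```python
import shlex

def build_append_import(jdbc_url, table, check_column, last_value, target_dir):
    """Return the argument list for an append-mode incremental `sqoop import`.

    Sqoop imports only rows where check_column > last_value.
    Credentials (--username / --password-file) are omitted in this sketch.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", "append",
        "--check-column", check_column,
        "--last-value", str(last_value),
    ]

# All values below are hypothetical and only illustrate the shape of the call.
cmd = build_append_import(
    jdbc_url="jdbc:mysql://dbhost/sales",
    table="orders",
    check_column="id",
    last_value=1000,
    target_dir="/data/orders",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

At the end of an incremental run Sqoop reports the value to pass as --last-value next time. A saved Sqoop job (sqoop job --create ...) keeps track of that value for you.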
2. Your source table receives both updates and inserts
If rows in the source table can also be updated, the only option is to fetch the entire table and compare source and target using a hash computed over all columns. You can identify the modified rows yourself by applying the hash() function to all columns of the newly imported table and of the Hive table, then comparing the results row by row.
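The comparison logic can be illustrated outside Hive with a small Python sketch. It assumes each row is a dict, that a column named id is the primary key, and it uses MD5 from hashlib in place of Hive's hash(). The names row_hash and diff_snapshots are made up for this example. In practice the same comparison would be a join between the fresh import and the Hive table.

```python
import hashlib

def row_hash(row, columns):
    """Hash all listed columns of a row into one hex digest."""
    # Use a separator that is unlikely to appear in the data, and a marker
    # for NULLs, so ("ab", "c") and ("a", "bc") do not hash identically.
    parts = ["\\N" if row[c] is None else str(row[c]) for c in columns]
    return hashlib.md5("\x1f".join(parts).encode("utf-8")).hexdigest()

def diff_snapshots(source_rows, target_rows, key, columns):
    """Split source rows into (inserted, updated) relative to the target.

    Rows deleted from the source are not detected by this sketch.
    """
    target_hashes = {r[key]: row_hash(r, columns) for r in target_rows}
    inserted, updated = [], []
    for r in source_rows:
        if r[key] not in target_hashes:
            inserted.append(r)   # key unknown in target: new row
        elif target_hashes[r[key]] != row_hash(r, columns):
            updated.append(r)    # same key, different content: modified row
    return inserted, updated

# Hypothetical data: a full re-import (source) vs. the current Hive table (target).
columns = ["id", "name", "qty"]
source = [
    {"id": 1, "name": "a", "qty": 1},
    {"id": 2, "name": "b", "qty": 5},
    {"id": 3, "name": "c", "qty": 2},
]
target = [
    {"id": 1, "name": "a", "qty": 1},
    {"id": 2, "name": "b", "qty": 4},
]

inserted, updated = diff_snapshots(source, target, key="id", columns=columns)
print("inserted:", inserted)
print("updated:", updated)
```

The two resulting lists correspond to the rows you would insert into and update in the Hive table, respectively.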
There are several ways to update a Hive table:
- MERGE (works better in Hive 2)
- replace the MERGE with two statements, an UPDATE and an INSERT, if you are using the stable Hive 1.2.x version

parisni
- Thank you for your reply, parisni, but how do I update data in HDFS that was updated in a source table with no timestamp column? – user8167344 May 13 '18 at 08:44
- I added some details. There is no other way IMO. – parisni May 13 '18 at 09:32
- Thank you very much, Parisni. Now I really get the idea. – user8167344 May 13 '18 at 14:35
- Nice. Then could you accept the answer so that I pass 50 reputation? – parisni May 13 '18 at 17:03
- Sure, Parisni. Thank you! – user8167344 May 14 '18 at 07:56