Synchronizing HBase tables between two clusters in Spark

Asked Sep 09 '19 at 06:57

Active Sep 09 '19 at 23:00

Viewed 242 times

I want to write a tool that synchronizes HBase tables between two environments. The tool should read data from the second cluster and update the table based on the timestamp.

I use hbase-client version 1.2.0-cdh5.12.1 and Spark version 2.4.0-cdh6.1.1.

I know about the CopyTable MapReduce solution (with timestamp parameters), but it seems to be slow.

Could anyone tell me whether it's possible to speed up processing by using Spark?

edited Sep 09 '19 at 23:00

mazaneicha

asked Sep 09 '19 at 06:57

DonRoberto

Have you considered using a coprocessor? Not sure if it helps. But delta backups and restores, scheduled to run via a shell script, could be a simple solution. – Kris Sep 09 '19 at 07:00

What's wrong with native HBase replication? http://hbase.apache.org/book.html#_cluster_replication – mazaneicha Sep 09 '19 at 22:58

0 Answers