
We have an hbase-0.94 cluster running hadoop-1.0.1. We don't want any downtime for this cluster while upgrading to hbase-0.98 with hadoop-2.5.1.

I have provisioned another hbase-0.98 cluster with hadoop-2.5.1 and want to copy the hbase-0.94 tables to hbase-0.98. HBase CopyTable does not seem to work for this purpose.

Please suggest a way to perform the above task.

Santosh Kumar

2 Answers


These are the available options; choose whichever fits your case.

  1. You can use the org.apache.hadoop.hbase.mapreduce.Export tool to export tables to HDFS, then use hadoop distcp to move the data to the other cluster. Once the data is placed on the second cluster, you can use the org.apache.hadoop.hbase.mapreduce.Import tool to import the tables (a sketch follows this list). Please look at http://hbase.apache.org/book.html#export.

  2. The second option is to use the CopyTable tool; please look at http://hbase.apache.org/book.html#copytable.

  3. The third option is to enable HBase snapshots, create table snapshots, and then use the ExportSnapshot tool to move them to the second cluster. Once the snapshots are on the second cluster, you can clone tables from them. Please look at http://hbase.apache.org/book.html#ops.snapshots.
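
A minimal sketch of option 1, assuming a table named mytable and hypothetical hosts and paths (source-nn, dest-nn, /user/hbase/...); adjust ports and paths to your clusters. Because the two clusters run different Hadoop major versions, one common approach is to run distcp on the destination cluster and read from the source over the read-only, HTTP-based hftp:// interface:

# 1) On the source (hbase-0.94 / hadoop-1.0.1) cluster:
#    export the table as SequenceFiles to HDFS.
hbase org.apache.hadoop.hbase.mapreduce.Export mytable /user/hbase/export/mytable

# 2) On the destination (hadoop-2.5.1) cluster: pull the files across.
#    hftp:// works across Hadoop versions; 50070 is the source
#    NameNode's HTTP port.
hadoop distcp hftp://source-nn:50070/user/hbase/export/mytable \
              hdfs://dest-nn:8020/user/hbase/export/mytable

# 3) On the destination (hbase-0.98) cluster: create the target table
#    first (Import does not create it), then import.
hbase org.apache.hadoop.hbase.mapreduce.Import mytable /user/hbase/export/mytable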

HBase snapshots allow you to take a snapshot of a table without much impact on the region servers. The snapshot, clone, and restore operations don't involve data copying, and exporting a snapshot to another cluster doesn't impact the region servers either. A sketch follows below.
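
A minimal sketch of option 3, with the same hypothetical names (mytable, mytable-snapshot, dest-nn). Snapshots must first be enabled on the 0.94 cluster (hbase.snapshot.enabled set to true in hbase-site.xml):

# On the source cluster: take a snapshot (HBase shell command).
echo "snapshot 'mytable', 'mytable-snapshot'" | hbase shell

# Export the snapshot to the destination cluster's hbase.rootdir.
# This runs a MapReduce job but does not load the region servers.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot mytable-snapshot \
    -copy-to hdfs://dest-nn:8020/hbase \
    -mappers 8

# On the destination cluster: clone a table from the snapshot
# (metadata-only, no data copy).
echo "clone_snapshot 'mytable-snapshot', 'mytable'" | hbase shell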

I have used options 1 and 3 for moving data between clusters, and in my case option 3 was the better solution.

Also, have a look at my answer posted

Ram Ghadiyaram
  • Your suggestion is good, but it is only valid if the clusters have compatible versions of Hadoop and HBase installed. In my case the versions aren't compatible: hadoop-1.0.1 vs hadoop-2.5.1 and hbase-0.94.1 vs hbase-0.98. I have already tried options 1 and 2, but they don't work. – Santosh Kumar Jul 27 '16 at 18:45
  • The export was fast, but the import is extremely slow. Can I convert the exported data into HFiles with the concept explained here: https://blog.cloudera.com/how-to-use-hbase-bulk-loading-and-why/ ? – ItayB May 29 '22 at 16:01
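
Regarding the last comment above: a hedged sketch of that idea. The Import tool can be told to write HFiles instead of issuing puts by setting import.bulk.output, and the generated files can then be bulk-loaded into the table; the table name and paths below are hypothetical:

# Instead of live puts, have Import generate HFiles under a staging dir.
hbase org.apache.hadoop.hbase.mapreduce.Import \
    -Dimport.bulk.output=/user/hbase/staging/mytable \
    mytable /user/hbase/export/mytable

# Atomically move the generated HFiles into the table's regions.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    /user/hbase/staging/mytable mytable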

Run the command below on the source cluster; make sure you have cross-cluster authentication enabled.

/usr/bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    -Ddfs.nameservices=nameservice1,devnameservice \
    -Ddfs.ha.namenodes.devnameservice=devnn1,devnn2 \
    -Ddfs.namenode.rpc-address.devnameservice.devnn1=<destination_namenode01_host>:<destination_namenode01_port> \
    -Ddfs.namenode.rpc-address.devnameservice.devnn2=<destination_namenode02_host>:<destination_namenode02_port> \
    -Ddfs.client.failover.proxy.provider.devnameservice=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider \
    -Dmapred.map.tasks.speculative.execution=false \
    --peer.adr=<destination_zookeeper host>:<port>:/hbase \
    --versions=<n> \
    <table_name>
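
Here --peer.adr is the destination cluster's ZooKeeper quorum in host:port:/hbase form, and the -Ddfs.* properties describe the destination's HA nameservice to the job; speculative execution is disabled, which is the usual practice for MapReduce jobs that write to HBase. The nameservice and NameNode names (devnameservice, devnn1, devnn2) are placeholders for your own configuration.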
Priyanshu