One of my data nodes has used 70% of its disk space, while the others have used only 30%. How can I migrate some of the data from the 70% node to the others? I cannot use HDFS rebalancing, because HBase is running on HDFS and rebalancing the data may cause HBase to lose data locality.
-1
-
Are you using a vendor distribution of Hadoop (e.g. CDH or Hortonworks), or the Apache one? – Suvarna Pattayil Jun 04 '16 at 16:33
-
Maybe I can manually move the data to other data nodes. What do you think? – Jack Jun 04 '16 at 16:52
-
I have not worked with HBase, but CDH Impala also relies on data locality, and its docs state that you need to run `refresh` and `invalidate metadata` after an HDFS rebalance to update the data locality. Isn't there a similar command for HBase? – Suvarna Pattayil Jun 05 '16 at 12:22
-
There are some valuable insights in this answer as well: http://stackoverflow.com/questions/23686387/hadoop-and-hbase-rebalancing-after-node-additions – Suvarna Pattayil Jun 05 '16 at 12:29
2 Answers
1
tl;dr: You're asking for a feature that is not yet part of HDFS.
There is a JIRA ticket, HDFS-1312, tracking the development effort. As in your problem statement, the proposed datanode balancer is intended to fix the issue that datanodes do not fill up their disks evenly. Fortunately, the feature is under active development and we can expect it to be merged into a Hadoop release in months (not years).
The JIRA ticket mentions two workarounds to use until the feature is released:
- Manually rebalancing blocks in storage directories
- Decommissioning nodes and later re-adding them
However, please do either of these manually only with great care.

Mingliang Liu
-
Since the balancer breaks data locality, I think it may not work in your case. The answer above is about rebalancing the disks within a datanode (not rebalancing across datanodes). If you want to balance data across datanodes and data locality can be relaxed for your application (HBase), you can consider restricting the balancer to specific datanodes with the -include option. – Mingliang Liu Jun 06 '16 at 18:12
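To make the -include suggestion above concrete, here is a minimal Python sketch that only assembles the balancer command line for a chosen set of datanodes. The function name `build_balancer_command` and the hostnames are hypothetical and not from the thread; `-threshold` and `-include` are options of `hdfs balancer`, but check the commands guide for your Hadoop version before running anything, and remember that moving blocks can still cost HBase its locality.

```python
import shlex


def build_balancer_command(hosts, threshold=10):
    """Return the argv list for an HDFS balancer run limited to `hosts`.

    `hosts` is a list of datanode hostnames (hypothetical values here).
    `threshold` is the allowed deviation, in percent, of a node's
    utilization from the cluster average.
    """
    if not hosts:
        raise ValueError("at least one datanode host is required")
    return [
        "hdfs", "balancer",
        "-threshold", str(threshold),
        # -include takes a comma-separated list of hosts
        "-include", ",".join(hosts),
    ]


if __name__ == "__main__":
    # Assumed hostnames: the 70% full node plus one 30% node to receive blocks.
    cmd = build_balancer_command(["dn1.example.com", "dn4.example.com"])
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

The sketch prints the command rather than running it, so it can be reviewed first and then run by hand as the HDFS superuser.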
1
I think all of your disks use the same format. If you want to migrate data off the 70% full node, you can use a partitioning approach:
create a hard disk partition with a different format,
then mount the disk and use it as you wish.

vk_only