In MapReduce, what do under-replicated and over-replicated blocks mean, and how are over-replicated and under-replicated blocks balanced?
As you are probably aware, the default replication factor in HDFS is 3.
Over-replicated blocks are blocks that exceed the target replication of the file they belong to. Normally, over-replication is not a problem: HDFS automatically deletes the excess replicas, and that is how the balance is restored in this case.
Under-replicated blocks are blocks that do not meet their target replication for the file they belong to.
To balance these, HDFS automatically creates new replicas of under-replicated blocks until they meet the target replication.
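The two definitions above come down to comparing a block's live replica count with the target replication of its file. A minimal sketch of that comparison follows; the function name classify_block is hypothetical, chosen for illustration, and is not an HDFS command:

```shell
#!/bin/sh
# Hypothetical helper (not part of HDFS): classify a block by comparing
# its live replica count with the target replication of its file.
classify_block() {
    live=$1      # number of live replicas of the block
    target=$2    # target replication of the file the block belongs to
    if [ "$live" -gt "$target" ]; then
        echo "over-replicated"     # HDFS would delete the excess replicas
    elif [ "$live" -lt "$target" ]; then
        echo "under-replicated"    # HDFS would create new replicas
    else
        echo "at target"
    fi
}

classify_block 4 3
classify_block 2 3
classify_block 3 3
```

With a target of 3, a block with 4 replicas is reported as over-replicated and one with 2 replicas as under-replicated.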
You can get information about the blocks being replicated (or waiting to be replicated) using
hdfs dfsadmin -metasave <filename>
which writes the report to the given file in the NameNode's log directory.
If you execute the command below, you will get detailed stats:
hdfs fsck /
......................
Status: HEALTHY
Total size: 511799225 B
Total dirs: 10 Total files: 22
Total blocks (validated): 22 (avg. block size 23263601 B)
Minimally replicated blocks: 22 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 4
Number of racks: 1
The filesystem under path '/' is HEALTHY
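If you only need the over- and under-replicated counts, the summary can be filtered with awk. This is a rough sketch, not an official tool: the helper name fsck_count is hypothetical, and the sample text is a few lines copied from the report above. On a live cluster you would pipe the output of hdfs fsck / into the function instead.

```shell
#!/bin/sh
# Sample input: summary lines copied from the fsck report above.
sample_report='Status: HEALTHY
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3'

# Hypothetical helper: read an fsck summary on stdin and print the
# block count that follows the given label.
fsck_count() {
    # $1 = label, e.g. "Under-replicated blocks"
    awk -F: -v label="$1" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }   # trim the label
        key == label { split($2, parts, " "); print parts[1]; exit }
    '
}

under=$(printf '%s\n' "$sample_report" | fsck_count "Under-replicated blocks")
over=$(printf '%s\n' "$sample_report" | fsck_count "Over-replicated blocks")
echo "under-replicated=$under over-replicated=$over"
```

A non-zero under-replicated count that does not shrink over time is usually the thing to look into.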

user3190018

Ram Ghadiyaram
- Please explain 'Minimally replicated blocks' as well. – Dinesh Kumar P Jul 10 '17 at 09:28
- replicas < minReplication: UNDER MIN REPL'D BLOCKS (HDFS-7537)
  replicas == minReplication: Minimally replicated blocks
  replicas < ReplicationFactor: Under-replicated blocks
  replicas == ReplicationFactor: Normally replicated blocks
  replicas > ReplicationFactor: Over-replicated blocks
  If ReplicationFactor equals minReplication, the block is counted as both minimally and normally replicated. – ammills01 Jan 30 '19 at 14:59