
How can we find out why "Reserved Space for Replicas" is constantly increasing, and how can we limit the space used by this type of cache? Last month we found that "Reserved Space for Replicas" exceeded the Non DFS used space, but we couldn't find out why :(

We know how to calculate "Non DFS", but "Reserved Space for Replicas" shows a size that doesn't correspond to the space actually occupied by "Non DFS". For example, on a 6 TB volume:

  1. "DFS Used" takes 4 TB
  2. "Free space" on volume is 2 TB (this info getting by "df -h")
  3. "Non DFS" takes 2 TB (why??? if "df -h" shows that we have 2 TB free space)

At the moment, to free up the space allocated for this type of cache ("Reserved Space for Replicas"), we have to restart the DataNode service. But in our opinion this is not a real solution!

We use HDP v3.0.1, HDFS v3.1, Oracle JDK 8u181


1 Answer


For people who have faced this type of problem: first of all, you should understand the nature of the problem. To do that, please read the descriptions of the following issues:

The following links are useful for understanding what a block replica is:

Solutions

  1. Find the misbehaving software that frequently breaks its connection to Hadoop during write or append operations (see the sketch after this list)
  2. Try to change the replication policy (risky)
  3. Update Hadoop to a later version
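
Regarding the first item: as described in the issues above, the typical trigger is a client that opens a file for write or append and then drops the connection without closing the stream, so the DataNode keeps the space it reserved for the replica under construction. Below is a minimal sketch of a write that always releases its stream; the target path is hypothetical, and your own client code will of course look different.

    // Minimal sketch of a write path that always releases the output stream,
    // so the DataNode can finalize the replica and drop its space reservation.
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SafeHdfsWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);
            Path target = new Path("/tmp/example-output.txt"); // hypothetical path

            // try-with-resources guarantees close() runs even if the write fails,
            // instead of leaving a replica stuck in the "being written" state.
            try (FSDataOutputStream out = fs.create(target, true)) {
                out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
                out.hflush(); // make the data visible to readers before closing
            }
        }
    }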

You can't reset "Reserved Space for Replicas" without restarting the Hadoop services!
