I have a Hadoop cluster, and within it there is a path that causes a lot of problems when it fills up. I want to write a script that checks the usage/capacity of that path so I can review it on a semiweekly basis. My command and results:
$ hdfs dfs -df -h /my/fat/directory
Filesystem           Size     Used  Available  Use%
hdfs://TheServer  866.9 T  593.7 T    242.4 T   68%
Regardless of the path provided, I get the usage for the entire cluster rather than for the considerably smaller directory I'm worried about.
How can I get the disk usage for my directory?
EDIT:
To clarify, I want both the capacity of the directory and its usage, not just the usage, so -du is not acceptable.
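For reference, here is a rough sketch of the kind of check I have in mind, built on the -df output shown above. The 80% threshold and the function names use_pct and check_usage are placeholders I made up for illustration. As written it would still report the cluster-wide figure, which is exactly the problem.

```shell
#!/bin/sh
# Hypothetical warning threshold in percent (placeholder value).
THRESHOLD=80

# Read `hdfs dfs -df` output on stdin and print the Use% value
# from the first data row, with the trailing % removed.
use_pct() {
    awk 'NR == 2 { sub(/%$/, "", $NF); print $NF }'
}

# Print a one-line status for the given integer percentage.
check_usage() {
    pct=$1
    if [ "$pct" -ge "$THRESHOLD" ]; then
        echo "WARN: ${pct}% used"
    else
        echo "OK: ${pct}% used"
    fi
}

# Intended use on the cluster (not run here):
#   check_usage "$(hdfs dfs -df /my/fat/directory | use_pct)"
```

What I am missing is a command whose output I can feed into something like this for the directory itself, instead of for hdfs://TheServer as a whole.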