Is there a smart way of deleting old files from the HDFS /tmp directory? (To be clear, I am not talking about the Unix filesystem's /tmp.)
Viewed 6,884 times
2 Answers
2
hadoop fs -stat "%Y" "/path/*"
This outputs the modification time of everything in /path/, in milliseconds since the Unix epoch (UTC). Compare each timestamp against a cutoff for what you consider too young to delete, and you can do the cleanup in a shell script kicked off by cron.
This might be smarter than parsing the other things output by hadoop fs.
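A minimal sketch of such a script follows, assuming bash and awk are available. The function names `filter_older_than` and `clean_hdfs_dir` are made up for illustration, and the format is extended to `"%Y %n"` so each timestamp is paired with its file name. The delete itself is left commented out so a dry run only prints what would be removed.

```shell
#!/usr/bin/env bash
# Sketch only: pick HDFS entries older than a cutoff from "hadoop fs -stat" output.

# Reads "<mtime_ms> <name>" lines on stdin and prints the names whose
# modification time is more than max_age_days before now_ms.
filter_older_than() {
  local max_age_days=$1 now_ms=$2
  awk -v now="$now_ms" -v days="$max_age_days" '
    BEGIN { cutoff = now - days * 86400000 }   # 86400000 ms per day
    $1 < cutoff { sub(/^[^ ]+ /, ""); print }  # drop the timestamp, keep the name
  '
}

# Hypothetical wrapper: list (and optionally delete) entries directly under
# dir that are older than max_age_days.
clean_hdfs_dir() {
  local dir=$1 max_age_days=$2
  local now_ms=$(( $(date +%s) * 1000 ))
  # The glob is quoted so that hadoop, not the local shell, expands it.
  hadoop fs -stat "%Y %n" "$dir/*" |
    filter_older_than "$max_age_days" "$now_ms" |
    while IFS= read -r name; do
      echo "would delete: $dir/$name"
      # Uncomment once the dry run looks right:
      # hadoop fs -rm -r -skipTrash "$dir/$name"
    done
}
```

From cron you would then call something like `clean_hdfs_dir /tmp 7`, where the 7-day cutoff is just an example value.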

Dan R
0
Here's (the source code of) a small tool that does the job: https://github.com/mag-/hdfs-cleanup/
I might write one of my own (or port the given one to Python) so I don't need to set up a Go build chain at my company.
And one more for Ruby users: https://github.com/nmilford/clean-hadoop-tmp

Robert Jack Will