Is there a smart way of deleting old files from the hdfs /tmp directory? (Just to make sure, I am not talking about the unix FS /tmp)

Istvan

2 Answers

hadoop fs -stat "%Y" "/path/*" prints the modification time of everything directly under /path/, in milliseconds since the Unix epoch. Compare those timestamps against a cut-off age beyond which you consider an entry too old, and you can do the clean-up in a shell script kicked off by cron.

This might be smarter than parsing the other output of hadoop fs.
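
A rough sketch of what such a script could look like, not a tested production tool. The function names filter_older_than and clean_hdfs_tmp are made up for illustration, and the format string "%Y %n" (modification time in milliseconds plus entry name) is used instead of plain "%Y" so that each timestamp can be matched to a path. The delete command is left commented out, so as written this is a dry run that only prints what it would remove.

```shell
#!/usr/bin/env bash
# Sketch only: both function names below are hypothetical helpers.

# Read "<mtime_ms> <name>" lines on stdin and print the names whose
# modification time is older than the cut-off (epoch milliseconds).
filter_older_than() {
  local cutoff_ms=$1
  awk -v cutoff="$cutoff_ms" '$1 < cutoff { sub(/^[^ ]+ /, ""); print }'
}

# List the direct children of an HDFS directory that are older than
# the given number of days. Assumes the hadoop CLI is on the PATH.
clean_hdfs_tmp() {
  local dir=$1 max_age_days=$2
  local now_ms cutoff_ms
  now_ms=$(( $(date +%s) * 1000 ))
  cutoff_ms=$(( now_ms - max_age_days * 24 * 60 * 60 * 1000 ))

  # The glob is quoted so that hadoop expands it, not the local shell.
  hadoop fs -stat "%Y %n" "$dir/*" |
    filter_older_than "$cutoff_ms" |
    while IFS= read -r name; do
      echo "would delete: $dir/$name"
      # Uncomment once the dry-run output looks right:
      # hadoop fs -rm -r -skipTrash "$dir/$name"
    done
}
```

Calling clean_hdfs_tmp /tmp 7 from a script run by cron would then report everything in the HDFS /tmp directory not modified in the last seven days. Keep in mind that a directory's modification time reflects changes to its direct children only, so a directory may look old while files deeper inside it are still being written.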

Dan R

Here's (the source code of) a small tool that does the job: https://github.com/mag-/hdfs-cleanup/

I might write my own (or port this one to Python) so that I don't need to set up a Go build chain at my company.

And one more for Ruby users: https://github.com/nmilford/clean-hadoop-tmp