17

My problem is that I have an app that writes a lot of relatively small CSV files (100-500 KB each, tens or hundreds of thousands of them). The contents of those files then get loaded into a database via a SQL*Loader call (it's an Oracle DB), and this is what I have to live with.

So, I need to remove those small files from time to time to prevent them from eating up all the disk space. I would like to attach that cleanup, as a final step, to the activity that writes those files and loads them into the DB.

My question is: how can one remove a bunch of small files in Java with the least performance overhead?

Thanks in advance! Michael

Zorkus

5 Answers

13

Well, file.delete() should suffice (it is internally implemented as a native method)
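As a minimal sketch of that approach (the class name `CsvCleanup` and the `.csv` suffix filter are assumptions for illustration, not part of the question), one could list the directory once and call `delete()` on each file, counting the successes because `delete()` reports failure only through its return value:

```java
import java.io.File;

final class CsvCleanup {
    /**
     * Deletes the regular files ending in ".csv" directly under dir.
     * Returns the number of files that were actually deleted.
     */
    static int deleteCsvFiles(File dir) {
        // listFiles returns null if dir is not a directory or an I/O error occurs
        File[] files = dir.listFiles((d, name) -> name.endsWith(".csv"));
        if (files == null) {
            return 0;
        }
        int deleted = 0;
        for (File f : files) {
            // delete() returns false instead of throwing when it fails
            if (f.isFile() && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Calling `CsvCleanup.deleteCsvFiles(new File(...))` after the load step and comparing the result with the number of files written would show whether anything was left behind.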

Bozho
  • Additionally, delete() returns a boolean: true if the file or directory was successfully deleted. – Agusti-N Jul 28 '10 at 18:09
  • I hope so, but I will make a prototype anyway and measure the timings. – Zorkus Jul 28 '10 at 18:12
  • You might also look into deleting a full directory with the method delete(). If this works, it may have some optimizations that make it much faster, but I tend to believe it will just fail. – Bill K Jul 28 '10 at 18:16
4

I'd suggest checking the Apache Commons IO library. They have some pretty helpful methods for deleting files in the FileUtils class.

ARKBAN
  • So I tested File.delete and found that it removes a file in about 0.5-1 millisecond on average (tested on generated sets of 1,000, 10,000, and 50,000 small files). Thanks everybody for the ideas! It should work for me. – Zorkus Jul 29 '10 at 17:06
3

You may find it an order of magnitude faster if you shell out and have the system delete them. You'd have to be able to hit a stopping point (where no files are being processed), then shell out and delete "*" or "*.*" or whatever the wildcard is for your OS.

(Note: this makes your program VERY OS-dependent!)

Be sure on Windows and Mac that you are bypassing the trashcan feature!

The nice thing about del *.* or rm * is that they SHOULD batch the operation rather than repeatedly opening, modifying and closing the directory.
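A rough sketch of shelling out from Java, assuming `ProcessBuilder` is acceptable in your environment (the class name `ShellDelete` and the exact command lines are illustrative assumptions; test them on your own OS before relying on them):

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class ShellDelete {
    /** Builds the OS-specific command; the wildcard is expanded by the shell, not by Java. */
    static List<String> deleteCommand(boolean windows) {
        return windows
                ? Arrays.asList("cmd", "/c", "del", "/q", "*.*")   // /q suppresses the confirmation prompt
                : Arrays.asList("sh", "-c", "rm -f -- *");          // -- guards against names starting with '-'
    }

    /** Runs the delete command with dir as the working directory; returns the exit code. */
    static int deleteAllIn(File dir) throws IOException, InterruptedException {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        Process p = new ProcessBuilder(deleteCommand(windows))
                .directory(dir)   // the wildcard is resolved relative to this directory
                .inheritIO()      // let the child's error messages reach our console
                .start();
        return p.waitFor();
    }
}
```

One caveat worth measuring: on Unix-like systems the shell expands `*` into one argument per file, so with hundreds of thousands of files the command can fail with an "argument list too long" error. Both `del` and `rm` delete directly rather than moving files to the trash.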

You might also write filenames with a pattern like a001, a002, a003, ... and when you reach a999 you go to b001 and delete a*.
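The naming scheme could be sketched like this, staying inside Java and using a glob from `java.nio.file` instead of a shell wildcard (the class `BatchNames`, the limit of 999 names per letter, and the 26-letter cap are assumptions made for illustration):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

final class BatchNames {
    private static final int PER_BATCH = 999; // a001 .. a999

    /** Maps a zero-based sequence number to a name such as a001, a999, b001. */
    static String nameFor(int seq) {
        if (seq < 0 || seq / PER_BATCH >= 26) {
            throw new IllegalArgumentException("sequence out of range: " + seq);
        }
        char letter = (char) ('a' + seq / PER_BATCH);
        return String.format(Locale.ROOT, "%c%03d", letter, seq % PER_BATCH + 1);
    }

    /** Deletes the regular files in dir whose names start with the given batch letter. */
    static int deleteBatch(Path dir, char letter) throws IOException {
        int deleted = 0;
        // The glob "a*" matches every entry name beginning with 'a'
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, letter + "*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.deleteIfExists(p)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

With this layout, the writer would call `deleteBatch(dir, 'a')` once it starts producing `b001`, so a whole finished batch is removed in a single directory pass.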

Bill K
  • That's right too. I will think that out further. Thanks a lot! – Zorkus Jul 28 '10 at 18:13
  • @Zorkus be sure to test and not just assume I'm right, I'm pretty sure I've tested this in the past and found it accurate, but it's probably very OS/Java Version/phase of the moon dependent. – Bill K Jul 28 '10 at 18:15
1
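// Requires Apache Commons IO. cleanDirectory deletes the directory's contents
// but keeps the directory itself, and throws IOException on failure.
import java.io.File;
import org.apache.commons.io.FileUtils;

FileUtils.cleanDirectory(new File("/usr/share/test")); // Linux
FileUtils.cleanDirectory(new File("C:\\test")); // Windows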
Chris
0

One can use the following methods of java.nio.file.Files:

delete(Path path)
deleteIfExists(Path path)

For more information, refer to this article.
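A small sketch of how these two methods might be used for bulk cleanup (the class name `NioCleanup` and the choice to collect failures in a map are assumptions for illustration). Unlike `File.delete()`, both methods throw an `IOException` that says why a deletion failed, and `deleteIfExists` treats an already missing file as a non-error:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

final class NioCleanup {
    /**
     * Tries to delete every given path and keeps going on failure.
     * Returns the paths that could not be deleted, with the reason for each.
     */
    static Map<Path, IOException> deleteAll(Iterable<Path> paths) {
        Map<Path, IOException> failures = new LinkedHashMap<>();
        for (Path p : paths) {
            try {
                // Returns false (no exception) when the file is already gone
                Files.deleteIfExists(p);
            } catch (IOException e) {
                // e.g. DirectoryNotEmptyException or AccessDeniedException
                failures.put(p, e);
            }
        }
        return failures;
    }
}
```

Swapping `deleteIfExists` for `Files.delete` would make a missing file show up as a `NoSuchFileException` in the returned map, which may be preferable if a missing file indicates a bug in the loader step.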

mcacorner