
I am looking for the most efficient way to move a directory recursively in Java. At the moment, I am using Apache commons-io as shown in the code below. (If destDir exists and already contains some of the files, I would like those to be overwritten and the nested directory structures to be merged.)

FileUtils.copyDirectoryToDirectory(srcDir, destDir);
FileUtils.deleteDirectory(srcDir);

While this does the trick, in my opinion, it isn't efficient enough. There are at least two issues that come to mind:

  • The copy temporarily needs twice as much disk space.
  • If this is an SSD, copying the data to another part of the drive and then erasing the old data adds unnecessary writes, which will eventually shorten the drive's life.

What is the best approach to do this?

As far as I understand, commons-io doesn't seem to use the new Java 7/8 features available in Files. On the other hand, I was unable to get Files.move(...) to work if the destDir exists (by "get it to work" I mean have it merge the directory structures -- it complains that the destDir exists).
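The behaviour described above can be reproduced with a small sketch. The class and method names (`MoveIntoExistingDir`, `tryMove`) are made up for illustration; the exceptions are the ones `Files.move` is documented to throw when the target already exists, with and without `REPLACE_EXISTING`.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class MoveIntoExistingDir {

    /**
     * Tries to move srcDir onto destDir with a single Files.move call and
     * reports the outcome as a string instead of propagating the exception.
     */
    static String tryMove(Path srcDir, Path destDir, boolean replaceExisting) throws IOException {
        try {
            if (replaceExisting) {
                Files.move(srcDir, destDir, StandardCopyOption.REPLACE_EXISTING);
            } else {
                Files.move(srcDir, destDir);
            }
            return "moved";
        } catch (DirectoryNotEmptyException e) {
            // REPLACE_EXISTING can only replace an empty directory
            return "DirectoryNotEmptyException";
        } catch (FileAlreadyExistsException e) {
            // without REPLACE_EXISTING any existing target is an error
            return "FileAlreadyExistsException";
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass two directory paths on the command line.
        System.out.println(tryMove(Paths.get(args[0]), Paths.get(args[1]), true));
    }
}
```

In neither case does `Files.move` merge the two trees, so a merge has to be built on top of it by walking the source directory.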

Regarding failures to move (please correct me, if I am wrong):

  • As far as I understand, an atomic move is one that only succeeds, if all files are moved at once. If I understand this correctly, this means that this is copying first and then deleting. I don't think this is what I'm looking for.
  • If a certain path/file cannot be moved, then the operation should cease and throw an exception, preserving the current source path it reached.

Please, note that I am not limiting myself to using the commons-io library. I am open to suggestions. I am using Java 8.

carlspring
  • What's the deal with [`Files.move(source, target)`](https://docs.oracle.com/javase/tutorial/essential/io/move.html)? – Sergey Kalinichenko Aug 15 '15 at 13:20
  • Do you need to merge the "srcDir" into an existing "destDir"? Otherwise, wouldn't a simple "mv srcDir destDir" operation be sufficient? – Philipp Claßen Aug 15 '15 at 13:22
  • @dasblinkenlight This only works on files and empty directories. – Tunaki Aug 15 '15 at 13:22
  • What do you want to happen if the move fails? Atomic moves are not easy. – Thorbjørn Ravn Andersen Aug 15 '15 at 13:24
  • Will this be executed daily, a great number of times? You don't have to watch every I/O operation on an SSD nowadays. – John Aug 15 '15 at 13:25
  • @Tunaki Not really: "if the directory is not empty, the move is allowed when the directory can be moved without moving the contents of that directory." – Sergey Kalinichenko Aug 15 '15 at 13:28
  • I don't want to use an external tool/command such as `rsync`, or `mv`. – carlspring Aug 15 '15 at 13:33
  • @John: No, it will be executed on-demand, it won't be a scheduled task. – carlspring Aug 15 '15 at 13:34
  • @PhilippClaßen: The `destDir` may, or may not exist. If it exists and does have some files/directories in it, then they should get overwritten and the directory structures of both directories should be merged. This is not an rsync-like synchronization of both sides, it would be a merge, only if the `destDir` exists. – carlspring Aug 15 '15 at 13:36
  • @ThorbjørnRavnAndersen: That's a good question. I suppose it should be an atomic action, but wouldn't that require a copy first? Another way would be to simply stop at the path that failed and give an error. – carlspring Aug 15 '15 at 13:41
  • I think you should consider carefully how things should work in case of problems (a.k.a production hardening). The answer to your question may give itself when you have that crystal clear. – Thorbjørn Ravn Andersen Aug 15 '15 at 14:06

3 Answers


This is just an answer to the "what needs to happen to the filesystem" part of the question, not how to do it with Java.

Even if you did want to call out to an external tool, Unix mv is not like Windows Explorer. Same-name directories don't merge. So you will need to implement that yourself, or find a library function that does. There is no single Unix system call that does the whole recursive operation (let alone atomically), so it's something either your code or a library function has to do.

If you need to atomically cut from one version of a tree to another, you need to build a new tree. The files can be hard-links to the old version. i.e. do the equivalent of

cp -al dir/  new
rsync -a /path/to/some/stuff/  new/
# or maybe something smarter / custom that renames instead of copies files.

# your sanity check here

mv  dir old &&
mv  new dir &&   # see below for how to make this properly atomic
rm -rf old

This leaves a window where dir doesn't exist. To solve this, add a level of indirection, by making dir a symlink. Symlinks can be replaced atomically with mv (but not ln -sf). So in Java, you want something that will end up doing a rename system call, not an unlink / rename.
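In Java, the symlink switch could look roughly like the sketch below. It assumes a POSIX-like filesystem where `Files.move` with `ATOMIC_MOVE` maps to a `rename` call that replaces an existing target (the Javadoc leaves replacement implementation-specific), and the names `SymlinkSwitch`, `repoint` and the `.tmp` suffix are invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class SymlinkSwitch {

    /**
     * Points the symlink 'link' at 'newTarget' by creating a temporary
     * symlink next to it and renaming that over the old one.
     */
    static void repoint(Path link, Path newTarget) throws IOException {
        // Hypothetical temp name; a real tool would pick a unique one.
        Path tmp = link.resolveSibling(link.getFileName() + ".tmp");
        Files.deleteIfExists(tmp);                 // removes a stale temp link, not its target
        Files.createSymbolicLink(tmp, newTarget);
        // Assumption: on POSIX systems this is a single rename(), so readers
        // see either the old link or the new one, never a missing path.
        Files.move(tmp, link, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java SymlinkSwitch dir new
        repoint(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

If the platform cannot perform the move atomically, `Files.move` throws `AtomicMoveNotSupportedException` rather than silently falling back to copy and delete.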


Unless you have a boatload of extremely small files (under 100 bytes), the directory metadata operations of building a hardlink farm are much cheaper than a full copy of a directory tree. The file data will stay put (and never even be read); the directory data will be a fresh copy. The file metadata (inodes) will be written for all files: once to update the ctime and link count when creating the hardlink farm, and again when removing the old tree, which leaves the files with their original link count.
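A rough Java counterpart of `cp -al` can be sketched with `Files.walkFileTree` and `Files.createLink`. This is an illustration only: the names `HardLinkFarm` and `linkTree` are made up, both trees are assumed to live on the same filesystem (hard links cannot cross filesystems), and directory permissions and timestamps are not copied.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class HardLinkFarm {

    /** Recreates the directories of src under dest and hard-links every file into them. */
    static void linkTree(Path src, Path dest) throws IOException {
        Files.walkFileTree(src, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Directories are fresh copies; only their entries are shared.
                Files.createDirectories(dest.resolve(src.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // createLink(link, existing): no file data is read or written.
                Files.createLink(dest.resolve(src.relativize(file)), file);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java HardLinkFarm dir new
        linkTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

`Files.createLink` throws `FileAlreadyExistsException` if the link path is already taken, so this sketch expects `dest` to be new or at least free of name clashes.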


If you're running on a recent Linux kernel, there is a new (2013) system call, renameat2, that can exchange two paths atomically. This avoids the symlink level of indirection. Using a Linux-only system call from Java is going to be more trouble than it's worth, though, since symlinks are easy.

Peter Cordes

I am answering my own question, as I ended up writing my own implementation.

What I didn't like about the implementations of:

  • Apache Commons IO
  • Guava
  • Springframework

for moving files was that all of them first copy the directories and files and then delete them. As far as I could tell (checked September 2015), they all seem to be stuck with methods from JDK 1.6.

My solution isn't atomic. It handles the moving by walking the directory structure and performing the moves file by file. I am using the new methods from JDK 1.7. It does the job for me and I'm sure other people would like to do the same and wonder how to do it and then waste time. I have therefore created a small Github project, which contains an illustration here:

If anybody has suggestions on how to improve it, or would like to add features, please feel free to open a pull request.

carlspring

Traverse the source directory tree:

  • When meeting a directory, ensure the same directory exists in the target tree (and has the right permissions etc).
  • When meeting a file, rename it to the same name in the corresponding directory in the target tree.
  • When leaving a directory, ensure it is empty and delete it.
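The three steps above can be sketched with `Files.walkFileTree` from Java 7. This is a minimal, non-atomic illustration rather than a hardened implementation: the names `MergeMove` and `moveMerging` are invented, existing target files are overwritten via `REPLACE_EXISTING`, directory permissions are not carried over, and the first exception aborts the walk, leaving whatever has not been moved yet in the source tree.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

class MergeMove {

    /** Moves the contents of src into dest, merging directories and overwriting files. */
    static void moveMerging(Path src, Path dest) throws IOException {
        Files.walkFileTree(src, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Ensure the matching directory exists in the target tree.
                Files.createDirectories(dest.resolve(src.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // A rename on the same filesystem; Files.move falls back to
                // copy-and-delete for this one file if the filesystems differ.
                Files.move(file, dest.resolve(src.relativize(file)),
                        StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Files.delete throws DirectoryNotEmptyException if anything is left.
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java MergeMove srcDir destDir
        moveMerging(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

On a single filesystem each file move is a metadata-only rename, so no file data is copied and no extra disk space is needed.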

Consider carefully how any error should be handled.

Note that you might also simply call out to "rsync" if it is available on your system.

Thorbjørn Ravn Andersen
  • This sounds good and in theory, I know what roughly should happen. I'm looking for an example code-wise. Surely, I am not the first person looking to move a few files around, efficiently as in `mv`-style. – carlspring Aug 15 '15 at 13:39
  • You have misunderstood how stackoverflow works. _You_ try first, then people help you fix your code. – Thorbjørn Ravn Andersen Aug 15 '15 at 14:07
  • Actually... I've been around on SO for a while, not as long as you perhaps, but I understand how it works. I can walk the directory structure myself with `Files.walk(...)` and end up implementing this myself. What I'm trying to understand is whether or not this already exists in some library, as, quite frankly, moving files is not rocket science -- it's an everyday kind of task. From my understanding, `commons-io` doesn't use the new features in Java 8? At the same time, the new features in Java 8 don't seem to behave well when the dest dir exists. – carlspring Aug 15 '15 at 14:23