
I'm facing a slightly weird situation.

I'm copying a file of around 500MB from a FileInputStream to a FileOutputStream. The copy itself goes pretty quickly (around 500ms). When I close this FileOutputStream the FIRST time, it takes about 1ms.

But here comes the catch: when I run this again, every consecutive close takes around 1500-2000ms! The duration drops back to 1ms when I delete the file.

Is there some essential java.io knowledge I'm missing?

It seems to be related to the OS. I'm running on Arch Linux (the same code run on Windows 7 keeps all the times under 20ms). Note that it doesn't matter whether it runs on OpenJDK or Oracle's JDK. The drive is a solid-state drive with an ext4 file system.

Here is my testing code:

public void copyMultipleTimes() throws IOException {
    copy();
    copy();
    copy();
    new File("/home/d1x/temp/500mb.out").delete();
    copy();
    copy();
    // Runtime.getRuntime().exec("sync") => same results
    // Thread.sleep(30000) => same results
    // combination of sync & sleep => same results
    copy();
}

private void copy() throws IOException {
    FileInputStream fis = new FileInputStream("/home/d1x/temp/500mb.in");
    FileOutputStream fos = new FileOutputStream("/home/d1x/temp/500mb.out");
    IOUtils.copy(fis, fos); // copyLarge => same results
    // copying always takes the same amount of time; only the close gets slower

    fis.close(); // closing the input stream is always fast
    // fos.flush(); // has no effect 
    // fos.getFD().sync(); // Solves the problem but takes ~2.5s

    long start = System.currentTimeMillis();
    fos.close();
    System.out.println("OutputStream close took " + (System.currentTimeMillis() - start) + "ms");
}

The output is then:

OutputStream close took 0ms
OutputStream close took 1951ms
OutputStream close took 1934ms
OutputStream close took 1ms
OutputStream close took 1592ms
OutputStream close took 1727ms
zdenda.online
  • Java7 running on Ubuntu 13.10 has the same behavior. – Chris K Aug 14 '14 at 13:27
  • Could you try to explicitly `flush` `fos` before the `close` (of course before `long start = System.currentTimeMillis();`)? – A4L Aug 14 '14 at 13:30
  • This does indeed appear to be OS specific, as I can't reproduce it under Win7 (getting inconsistent results) – blgt Aug 14 '14 at 14:07
  • That is true Chris but the time is the same on the "classic" HDD. – zdenda.online Aug 14 '14 at 14:26
  • I'd be astonished if it had anything to do with Java at all. Try it with 'cp'. I expect you will get very similar results. It most likely has to do with how the solid state drive works. – user207421 Aug 14 '14 at 23:33
  • You are absolutely right EJP. I did a little bash script with the exact same results – zdenda.online Aug 15 '14 at 08:22

3 Answers


@Duncan proposed the following explanation:

The first call to close() returns quickly, yet the OS is still flushing data to disk. The subsequent calls to close() can't complete until the previous flushing is complete.

I think this is close to the mark, but not exactly correct.

I think that what is actually going on here is that the first copy fills up the operating system's file buffer cache with a large number of dirty pages. The internal daemon that flushes the dirty pages to disc may start working on them, but it is still going when you start the second copy.

When you do the second copy, the OS tries to acquire buffer cache pages for reading and writing. But since the buffer cache is full of dirty pages, the read and write calls repeatedly block, waiting for free pages to become available. Before a dirty page can be recycled, the data in the page first needs to be written to disc. The net result is that the copy slows down to the effective data write rate.


A 30 second pause may not be sufficient to complete flushing the dirty pages to disc.
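If you want to see whether this is what is happening, you can watch the kernel's dirty page counter while the test runs. Here is a minimal, Linux-specific sketch (the class name is mine) that samples the Dirty line of /proc/meminfo once a second; you should see the figure jump by roughly the file size during each copy and then drain at the disc's write rate:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DirtyWatcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        for (int i = 0; i < 60; i++) {
            // /proc/meminfo reports how much page cache data is dirty,
            // i.e. modified but not yet written back to disc.
            for (String line : Files.readAllLines(Paths.get("/proc/meminfo"),
                                                  StandardCharsets.UTF_8)) {
                if (line.startsWith("Dirty:")) {
                    System.out.println(line); // e.g. "Dirty:   512340 kB"
                }
            }
            Thread.sleep(1000);
        }
    }
}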

One thing you could try is to do an fsync(fd) or fdatasync(fd) between the copies. In Java, the way to do that is to call FileDescriptor.sync().

Now, I can't say if this is going to improve total copy throughput, but I'd expect a sync operation to be better at writing out (just) one file than relying on the page eviction algorithm to do it.
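In terms of the code in the question, the sync variant would look something like this (a sketch only; the method name is mine, and it assumes Commons IO on the classpath as in the original test):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.io.IOUtils;

private void copyAndSync() throws IOException {
    FileInputStream fis = new FileInputStream("/home/d1x/temp/500mb.in");
    FileOutputStream fos = new FileOutputStream("/home/d1x/temp/500mb.out");
    try {
        IOUtils.copy(fis, fos);
        // Write this file's dirty pages out to disc now, rather than leaving
        // them for the page eviction machinery to deal with mid-copy.
        fos.getFD().sync();
    } finally {
        fis.close();
        fos.close(); // now returns quickly; the wait has moved into sync()
    }
}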

Stephen C
  • Why would it do that? – Stephen C Aug 14 '14 at 13:48
  • I tried to add fos.getFD().sync() before fos.close(). It really helps in the way that close now takes only 1ms (but sync takes between 2-2.5s). – zdenda.online Aug 14 '14 at 14:36
  • Do you think that adding this sync (even though it takes more time than the stream copy) proves your theory? If yes I will probably mark your answer as correct - it is the best explanation so far. – zdenda.online Aug 15 '14 at 07:48
  • No ... I don't think it proves it. It supports my theory though. Also, sync probably isn't the way to maximize copy throughput. But then again, I'm not convinced that your test is a realistic way to measure copy throughput. (Repeatedly copying the same file ....) – Stephen C Aug 15 '14 at 07:54
  • When copying different input files (every copy has its own source file), the results are the same. When copying to a different output file, closing is fast of course. The fix is easy - delete the file if doing an overwrite (as Files.copy does). I didn't want to measure throughput - I was just curious why it is happening, and your answer is the best so far when it comes to the question "Why?" :-) – zdenda.online Aug 15 '14 at 08:04
  • I'm going to write down the response where I summarize all the answers. Thank you very much for your thoughts. – zdenda.online Aug 15 '14 at 08:24
  • Copying multiple inputs to one output file (unless you are appending) is equally unrealistic. I think these are simply "use-cases" that the Linux developers don't think are worth tuning. – Stephen C Aug 15 '14 at 08:29

You seem to be on to something interesting. Under Linux, a process is allowed to keep holding an open file handle to the original file while you delete the directory entry and start afresh; the deletion does not bother the original file (handle). On closing, then, maybe some disk directory work happens.

Test it with IOUtils.copyLarge and Files.copy:

Path target = Paths.get("/home/d1x/temp/500mb.out");
Files.copy(fis, target, StandardCopyOption.REPLACE_EXISTING);

(I once saw an IOUtils.copy that just called copyLarge, but Files.copy should behave nicely.)

Joop Eggen
  • copyLarge has no effect but using Files.copy works fine - it always takes only the time needed for copy (there is no stream closing and the Files.copy itself takes the same time as IOUtils). – zdenda.online Aug 14 '14 at 13:50
  • So if one leaves the closing to Files.copy, it does it right. (Happy.) Maybe IOUtils wraps a BufferedInputStream around the InputStream and something weird happens on flush/close? I do not understand it, and IOUtils is obsolete now anyway... – Joop Eggen Aug 14 '14 at 14:01
  • It is probably not connected to IOUtils but the FileOutputStream handling. If you do the copy on your own (using own byte[] buffer and outputStream.write-s) the result will be the same (long delays between close). If you look inside IOUtils.copy you will see it does not do any magic (simple outputStream.write in while loop) – zdenda.online Aug 14 '14 at 14:04
  • Then it is worth looking into `Files.copy(InputStream, Path)` - at some point. – Joop Eggen Aug 14 '14 at 14:56
  • There is a delete before the copy if the file exists. That is why it works fine (the example also works when you call delete before copy). – zdenda.online Aug 14 '14 at 14:58
  • @d1x aha, Files.copy checks REPLACE_EXISTING and then does a delete itself. Almost gives me the urge to start the C compiler. – Joop Eggen Aug 14 '14 at 15:08

Note that this question was asked because I was curious why this is happening; it was not meant to be a measurement of copy throughput.

To summarize:

As EJP noted, the whole thing is not connected to Java. The result is the same if multiple consecutive cp commands are run in a bash script.

The best answer as to why this is happening is Stephen's: an fsync between the copy calls removes the issue (but the fsync itself takes ~2.5s).

The best way to solve this is to do what Files.copy(in, target, REPLACE_EXISTING) does (as in Joop's answer): first check whether the target file exists, and if so delete it (instead of "overwriting" it). Then you can write and close the stream quickly; see the sketch below.
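In code, the workaround looks something like this (a sketch along the lines of my original test; the method name is mine and Commons IO is assumed as before):

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.io.IOUtils;

private void copyFresh() throws IOException {
    File target = new File("/home/d1x/temp/500mb.out");
    if (target.exists()) {
        target.delete(); // drop the old file instead of rewriting it in place
    }
    FileInputStream fis = new FileInputStream("/home/d1x/temp/500mb.in");
    FileOutputStream fos = new FileOutputStream(target);
    try {
        IOUtils.copy(fis, fos);
    } finally {
        fis.close();
        fos.close(); // fast again, just like the very first copy
    }
}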

zdenda.online