4

I have an SSD that, according to its specification, should deliver at least 10k IOPS. My benchmark confirms that it can reach 20k IOPS.

I then wrote the following test:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.Random;

public class NioTest {
    private static final int sector = 4*1024;
    private static byte[] buf = new byte[sector];
    private static int duration = 10; // seconds to run
    private static long[] timings = new long[50000]; // per-read latency in ms; holds at most 50000 samples

    public static final void main(String[] args) throws IOException {
        String filename = args[0];
        long size = Long.parseLong(args[1]); // device size in bytes
        RandomAccessFile raf = new RandomAccessFile(filename, "r");
        Random rnd = new Random();
        long start = System.currentTimeMillis();
        int ios = 0;
        while (System.currentTimeMillis()-start<duration*1000) {
            long t1 = System.currentTimeMillis();
            // pick a random 4 KiB block, then seek to its byte offset
            long pos = (long)(rnd.nextDouble()*(size>>12));
            raf.seek(pos<<12);
            int count = raf.read(buf);
            timings[ios] = System.currentTimeMillis() - t1;
            ++ios;
        }
        System.out.println("Measured IOPS: " + ios/duration);
        int totalBytes = ios*sector;
        double totalSeconds = (System.currentTimeMillis()-start)/1000.0;
        double speed = totalBytes/totalSeconds/1024/1024;
        System.out.println(totalBytes+" bytes transferred in "+totalSeconds+" secs ("+speed+" MiB/sec)");
        raf.close();
        // Unused slots stay 0 and sort to the front, so the recorded
        // timings occupy the last ios entries; indices count down from the end.
        Arrays.sort(timings);
        int l = timings.length;
        System.out.println("The longest IO = " + timings[l-1]);
        System.out.println("Median duration = " + timings[l-(ios/2)]);
        System.out.println("75% duration = " + timings[l-(ios * 3 / 4)]);
        System.out.println("90% duration = " + timings[l-(ios * 9 / 10)]);
        System.out.println("95% duration = " + timings[l-(ios * 19 / 20)]);
        System.out.println("99% duration = " + timings[l-(ios * 99 / 100)]);
    }
}

When I run it, I get only 2186 IOPS:

$ sudo java -cp ./classes NioTest /dev/disk0 240057409536
Measured IOPS: 2186
89550848 bytes transferred in 10.0 secs (8.540234375 MiB/sec)
The longest IO = 35
Median duration = 0
75% duration = 0
90% duration = 0
95% duration = 0
99% duration = 0

Why is it so much slower than the same test in C?

Update: here is the Python code that gives 20k IOPS:

import random
import time

def iops(dev, blocksize=4096, t=10):
    # mediasize(dev) is a helper (not shown here) that returns the device size in bytes
    fh = open(dev, 'rb')
    count = 0
    start = time.time()
    while time.time() < start+t:
        count += 1
        pos = random.randint(0, mediasize(dev) - blocksize) # need at least one block left
        pos &= ~(blocksize-1)   # sector alignment at blocksize
        fh.seek(pos)
        blockdata = fh.read(blocksize)
    end = time.time()
    t = end - start
    fh.close()
    return count / t   # reads per second

Update 2: NIO code (only the lines that differ; the rest of the method is unchanged)

...
RandomAccessFile raf = new RandomAccessFile(filename, "r");
InputStream in = Channels.newInputStream(raf.getChannel());
...
int count = in.read(buf);
...
Anthony
  • 2
    Are you using the same sequence of random numbers in Java and C? Note that the raw disk transfer speed is irrelevant. For random access you need to look at seek times. – Patricia Shanahan Jun 27 '15 at 23:03
  • 4
    Why does writing 40000 .java files to my pocket USB drive take 8 minutes, versus 20 seconds for one MP4 of the same cumulative size that I ripped? I want my money back (for the USB drive). – AsConfused Jun 27 '15 at 23:09
  • 3
    Post the code for the same test in C, so readers can feel certain what is being compared. – Dan Getz Jun 27 '15 at 23:41
  • 2
    I suspect you're not comparing like for like. Java IO is not buffered unless you explicitly use buffering. The standard C APIs buffer by default unless you use the low level APIs. What does your C code look like? – Peter Brittain Jun 27 '15 at 23:43
  • Downvote for tendentious title. Java's disk I/O doesn't even exist, let alone suck. It just calls the operating system, in very straightforward ways. Any performance problem is attributable to unstated differences between your Java code and your C code. – user207421 Jun 27 '15 at 23:45
  • @EJP Is that all you can say on this question? – Anthony Jun 27 '15 at 23:48
  • 3
    Interesting that the class is named NioTest but contains no NIO code. While there's plenty of evidence that using NIO does not guarantee a speed increase, I still would like to see the same test done with a FileChannel, perhaps even with a MappedByteBuffer, since the question claims a deficiency with Java itself. – VGR Jun 27 '15 at 23:52
  • @Antonio, am I correct to guess you're on Mac OS X (given you're using `/dev/disk0` as a source)? I've noticed that when I do `sudo dd if=/dev/disk0 bs=4m of=/dev/null` and Ctrl-C after a few seconds, I get much lower performance than when I use `if=/path/to/actual/file/on/dev/disk0/filesystem`. Can you try an actual file with your Java program? – Iwillnotexist Idonotexist Jun 27 '15 at 23:55
  • @VGR I removed some code to make my question more readable. NIO gave me the same IOPS. – Anthony Jun 27 '15 at 23:55
  • 1
    @IwillnotexistIdonotexist good catch! I am indeed on a Mac. Reading from a file is faster because of the filesystem cache. However, I tried it anyway and got the same IOPS on large files (~30 GB). – Anthony Jun 27 '15 at 23:57
  • 2
    This doesn't affect the IOPS result you're asking about, but using `System.nanoTime()` instead of `currentTimeMillis()` would allow you to measure individual timings with more precision. – Dan Getz Jun 27 '15 at 23:59
  • @Antonio Please share the NIO version of your test with us. – VGR Jun 28 '15 at 00:13
  • @VGR I've just added it. – Anthony Jun 28 '15 at 00:29
  • In case you're curious, the executed code in the JDK is here: [RandomAccessFile.java](http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/classes/java/io/RandomAccessFile.java#l354), [RandomAccessFile.c](http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/native/java/io/RandomAccessFile.c#l72) and [io_util.c](http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/native/java/io/io_util.c#l75). Note in particular that the code: goes native, `malloc()`'s, reads, _copies_ the data & returns, and to top it off makes several (Java) JNI calls _from C_. – Iwillnotexist Idonotexist Jun 28 '15 at 01:00
  • Using NIO and reading into direct byte buffers might avoid the copy operation mentioned by @IwillnotexistIdonotexist – the8472 Jun 28 '15 at 01:19

4 Answers

7

Your question is based on the false assumption that C code analogous to your Java code would perform as well as IOMeter does. Because this assumption is false, there is no discrepancy between C performance and Java performance to explain.

If your question is why your Java code performs so badly relative to IOMeter, the answer is that IOMeter doesn't issue requests one at a time like your code does. To get the full performance from your SSD, you need to keep its request queue non-empty, and waiting for each read to finish before issuing the next can't possibly do that.

Try using a pool of threads to issue your requests.
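
As a rough sketch of that suggestion, the question's read loop could be run from several threads at once. The arithmetic behind it: at 2186 IOPS with one request in flight, each read takes about 1/2186 s, or roughly 0.46 ms, so reaching 20k IOPS would need something like nine requests outstanding, assuming per-request latency stayed the same. The class name `ParallelReadTest`, the `run` method, and the default of 16 threads are illustrative assumptions, not part of the original post, and this is not a tuned benchmark.

```java
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical class name; a sketch under the assumptions stated above.
public class ParallelReadTest {
    private static final int BLOCK = 4 * 1024;

    /** Issues random 4 KiB reads from `threads` workers for `millis` ms; returns the total number of reads. */
    public static long run(String filename, long size, int threads, long millis) throws Exception {
        final long blocks = size / BLOCK;
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);

        // Each worker opens its own handle so that seek() positions are not shared.
        Callable<Long> worker = () -> {
            byte[] buf = new byte[BLOCK];
            long reads = 0;
            try (RandomAccessFile raf = new RandomAccessFile(filename, "r")) {
                while (deadline - System.nanoTime() > 0) {
                    // random block-aligned offset inside the device or file
                    long pos = ThreadLocalRandom.current().nextLong(blocks) * BLOCK;
                    raf.seek(pos);
                    raf.read(buf);
                    reads++;
                }
            }
            return reads;
        };

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(worker));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // blocks until that worker has finished
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: ParallelReadTest <file> <size-in-bytes> [threads]");
            return;
        }
        int threads = args.length > 2 ? Integer.parseInt(args[2]) : 16;
        int seconds = 10;
        long reads = run(args[0], Long.parseLong(args[1]), threads, seconds * 1000L);
        System.out.println("Measured IOPS: " + reads / seconds);
    }
}
```

It takes the same arguments as the question's test plus an optional thread count. Varying the thread count should show whether throughput scales with the number of outstanding requests on a given drive.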

David Schwartz
4

According to this article (which is dated), legacy Java random access is 2.5 to 3.5 times slower than C/C++. It's a research PDF, so don't blame me if you click it.

Link: http://pages.cs.wisc.edu/~guo/projects/736.pdf

Java raw I/O is slower than C/C++, since system calls in Java are more expensive; buffering improves Java I/O performance, for it reduces system calls, yet there is no big gain for larger buffer size; direct buffering is better than the Java-provided buffered I/O classes, since the user can tailor it for his own needs; increasing the operation size helps I/O performance without overheads; and system calls are cheap in Java native methods, while the overhead of calling native methods is rather high. When the number of native calls is reduced properly, a performance comparable to C/C++ can be achieved.

Your code is from that era. Let's rewrite it using java.nio instead of RandomAccessFile, shall we?

I have some nio2 code we can pit against C. Garbage collection can be ruled out :)
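
For illustration only, here is a minimal sketch of what a java.nio version of the read loop might look like. It is not the nio2 code mentioned above, which is not shown in this answer. It assumes a `FileChannel` opened read-only and a direct `ByteBuffer`, with the position passed to `read` instead of a separate `seek`. The class name `DirectBufferRead` and the method `readRandomBlocks` are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Hypothetical class name; a sketch, not the answer author's code.
public class DirectBufferRead {
    private static final int BLOCK = 4 * 1024;

    /** Performs `reads` random 4 KiB positional reads and returns the number of bytes read. */
    public static long readRandomBlocks(Path path, long size, int reads) throws IOException {
        long blocks = size / BLOCK;
        // A direct buffer lives outside the Java heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(BLOCK);
        Random rnd = new Random();
        long bytes = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            for (int i = 0; i < reads; i++) {
                long pos = (long) (rnd.nextDouble() * blocks) * BLOCK;
                buf.clear();               // reset position and limit for the next read
                int n = ch.read(buf, pos); // positional read; channel position is unchanged
                if (n > 0) {
                    bytes += n;
                }
            }
        }
        return bytes;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: DirectBufferRead <file> <size-in-bytes>");
            return;
        }
        int reads = 20000;
        long start = System.nanoTime();
        long bytes = readRandomBlocks(Paths.get(args[0]), Long.parseLong(args[1]), reads);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(bytes + " bytes in " + seconds + " secs, IOPS: " + (long) (reads / seconds));
    }
}
```

This still issues one request at a time, so it only removes per-call overhead on the Java side. It does not change how many requests the drive sees in flight.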

Drew
  • I believe I'm doing something wrong; I just cannot figure out what. I tried NIO but got the same IOPS. If you can propose alternative code, it would be much appreciated. – Anthony Jun 28 '15 at 00:14
  • It won't be faster but it won't be 3.5 times slower – Drew Jun 28 '15 at 00:16
  • I used to do assembly and C only so I am not delusional. Well, mostly not. – Drew Jun 28 '15 at 00:18
1

Because you're using RandomAccessFile, which is one of the slowest methods of disk I/O in Java.

Try using something faster, like a BufferedInputStream or a BufferedOutputStream, and see what speeds you get.

If you're wondering why this would make a difference on an SSD (because SSDs are supposed to be good at random access), it's not about the randomness of the access; it's about the bandwidth. If you have an SSD with a 1024-bit-wide bus, but you're only writing 64 bits per write (as you would be doing by writing longs or doubles), you'll get slow speed. (These numbers are just for example purposes, of course.)

Now, I can see that that's not what your code is doing (or at least, appears to be doing), but it's quite possible that RandomAccessFile implements it that way under the hood. Again, try with a buffered stream and see what happens.

Sam Estep
  • I don't have 2TiB of memory to use BufferedInputStream – Anthony Jun 27 '15 at 23:30
  • 1
    Since when does `BufferedInputStream` require 2 TB of memory? – Sam Estep Jun 27 '15 at 23:31
  • 1
    Do you happen to know that BufferedInputStream is for sequential reading (and my test is for random reading)? – Anthony Jun 27 '15 at 23:33
  • 1
    This answer is not about my question at all. – Anthony Jun 27 '15 at 23:35
  • 1
    @Antonio Yes, I am aware of that. However, you didn't specify whether the 10k IOPS specification was for sequential or random access, or the details of your C I/O test. – Sam Estep Jun 27 '15 at 23:35
  • Then can you explain why, from your point of view, Random is used in my code? – Anthony Jun 27 '15 at 23:36
  • 1
    @Antonio Yes, I can see that this Java benchmark you've written is meant to be random-access. However, I cannot see any details about the specification or your "benchmark" that "confirms that it can give me 20k IOPS." As you expressed incredulity at the apparent massive difference in speed between Java and C in this case, my first guess was that your code was flawed. As a programmer, you should know that that should be your first guess as well. – Sam Estep Jun 27 '15 at 23:39
  • 1
    @Antonio I'm afraid I don't believe you. Please post the code from your C benchmark. – Sam Estep Jun 27 '15 at 23:43
  • I do not need you to believe me. You can read the drive's specification if you are interested: http://ocz.com/consumer/agility-3/specifications – Anthony Jun 27 '15 at 23:53
  • @Antonio Thank you. Of course, you still need to post your C benchmark code if you want any help. – Sam Estep Jun 27 '15 at 23:56
  • 1
    @Antonio that doesn't seem very nice. Especially since you came here asking for help to begin with. This is like me saying "There is something wrong with Ford because the Focus goes to 122mph. My car at home goes to 700mph." – Obicere Jun 28 '15 at 00:03
  • 2
    @Antonio You asked a question on SO—anyone is free to help you. If you disagree with an answer/think it is wrong, down vote it and move on. Don't ask someone to delete an answer. Also, please don't be rude/belligerent (in comments or in question titles). – royhowie Jun 28 '15 at 00:06
  • I'm sorry, guys. I really was asking for help, not for a flood of comments. As soon as there is at least one answer, the number of readers drops significantly. That's why a flood like this prevents me from finding people who know the answer. – Anthony Jun 28 '15 at 00:10
  • @Antonio That's just how Stack Exchange works. I have no control over that. If you don't like it, raise a complaint on [meta](http://meta.stackexchange.com/). – Sam Estep Jun 28 '15 at 00:11
  • 5
    @Antonio - Having an answer does not prevent you from getting another. Being rude to people who answer you does. Please understand that RedRoboHood is volunteering their time to try to help you. – Brad Larson Jun 28 '15 at 00:11
1

Random access is mostly fast in Java, but it can't compare to C. If you want a better comparison of I/O performance on the JVM, read Martin Thompson's excellent blog post on the subject: http://mechanical-sympathy.blogspot.co.uk/2011/12/java-sequential-io-performance.html

user586050