89

What is the difference between a byte array and a byte buffer?
Also, in what situations should one be preferred over the other?

[My use case is a web application being developed in Java.]

Rajat Gupta

3 Answers

95

There are actually a number of ways to work with bytes. And I agree that it's not always easy to pick the best one:

  • the byte[]
  • the java.nio.ByteBuffer
  • the java.io.ByteArrayOutputStream (in combination with other streams)
  • the java.util.BitSet

A byte[] is just an array of primitives holding the raw data, so it has no convenience methods for building or manipulating its content.

A ByteBuffer is more like a builder on top of a byte[] (a heap buffer is backed by one, which array() exposes). Unlike an array, it has convenient helper methods, e.g. put(byte) and putInt(int). It's not that straightforward to use, though. (Most tutorials are way too complicated or of poor quality, but this one will get you somewhere. Want to take it one step further? Then read about the many pitfalls.)

You could be tempted to say that a ByteBuffer does for byte[] what a StringBuilder does for String. But the ByteBuffer class has one specific shortcoming: although it may look as if it resizes automatically while you add elements, a ByteBuffer actually has a fixed capacity. You have to specify the maximum size of the buffer when you instantiate it.
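As a rough illustration of the helper methods, here is a minimal sketch. The class name `ByteBufferBasics` and the byte values are made up for this example; the write methods on `ByteBuffer` are the `put` family (`put(byte)`, `putShort(short)`, `putInt(int)`), and `flip()` switches the buffer from writing to reading.

```java
import java.nio.ByteBuffer;

final class ByteBufferBasics {
    // Writes a byte, a short and an int, then copies out exactly what was written.
    static byte[] build() {
        ByteBuffer buf = ByteBuffer.allocate(16);   // capacity is fixed at 16 bytes
        buf.put((byte) 0x01);                       // relative put: position advances by 1
        buf.putShort((short) 0x0203);               // 2 bytes, big-endian by default
        buf.putInt(0x04050607);                     // 4 bytes, big-endian by default
        buf.flip();                                 // limit = position, position = 0
        byte[] out = new byte[buf.remaining()];     // 7 bytes were written
        buf.get(out);                               // bulk read into the array
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(build()));
    }
}
```

Forgetting `flip()` before reading is probably the most common mistake: the buffer would then hand back the unwritten tail instead of the data.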
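The fixed capacity is easy to see in a small sketch (the class and method names are hypothetical): writing one byte more than the allocated capacity throws a `BufferOverflowException` instead of growing the buffer.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

final class FixedCapacityDemo {
    // Returns true if writing capacity + 1 bytes overflows the buffer.
    static boolean overflows(int capacity) {
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        try {
            for (int i = 0; i <= capacity; i++) {   // one write too many
                buf.put((byte) i);
            }
            return false;                           // not reached for a ByteBuffer
        } catch (BufferOverflowException e) {
            return true;                            // the buffer did not grow
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(4));
    }
}
```

If the final size is not known up front, you would have to allocate a larger buffer yourself and copy the old content into it.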

That's one of the reasons why I often prefer ByteArrayOutputStream: it resizes automatically, just like an ArrayList does, and its toByteArray() method gives you the result as a byte[]. Sometimes it's practical to wrap it in a DataOutputStream, which gives you some additional convenience calls (e.g. writeShort(int) if you need to write 2 bytes).

BitSet comes in handy when you want to perform bit-level operations. You can get/set individual bits, and it has logical operator methods like xor(). (The toByteArray() method was only introduced in Java 7.)

Of course, depending on your needs, you can combine all of them to build your byte[].
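A minimal sketch of that combination, with a made-up class name and made-up values: the `ByteArrayOutputStream` grows as needed, and the `DataOutputStream` wrapper adds typed writes such as `writeShort(int)`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class GrowingBufferDemo {
    // Builds a byte[] without knowing the final size in advance.
    static byte[] build() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(); // resizes automatically
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(0x01);
            out.writeShort(0x0203);      // writes 2 bytes, high byte first
            out.writeInt(0x04050607);    // writes 4 bytes, high byte first
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();      // copy of the bytes written so far
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(build()));
    }
}
```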
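For illustration, a small sketch of bit-level work with `BitSet` (class name and bit positions are made up). Note that `toByteArray()` is little-endian: the first byte holds bits 0 to 7.

```java
import java.util.BitSet;

final class BitSetDemo {
    // XORs two bit sets and returns the result as bytes.
    static byte[] xorExample() {
        BitSet a = new BitSet();
        a.set(0);
        a.set(3);                 // a = bits {0, 3}
        BitSet b = new BitSet();
        b.set(3);
        b.set(9);                 // b = bits {3, 9}
        a.xor(b);                 // bit 3 cancels out, a = bits {0, 9}
        return a.toByteArray();   // byte 0 holds bits 0..7, byte 1 holds bits 8..15
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(xorExample()));
    }
}
```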

bvdb
28

ByteBuffer is part of the new IO package (nio) that was developed for fast throughput of file-based data. Specifically, Apache is a very fast web server (written in C) because it reads bytes from disk and puts them on the network directly, without shuffling them through various buffers. It does this through memory-mapped files, which early versions of Java did not have. With the advent of nio, it became possible to write a web server in Java that is as fast as Apache. When you want very fast file-to-network throughput, you want to use memory-mapped files and ByteBuffer.

Databases typically use memory-mapped files, but this type of usage is seldom efficient in Java. In C/C++, it's possible to load up a large chunk of memory and cast it to the typed data you want. Due to Java's security model, this isn't generally feasible, because you can only convert to certain native types, and these conversions aren't very efficient. ByteBuffer works best when you are just dealing with bytes as plain byte data; once you need to convert them to objects, the other Java IO classes typically perform better and are easier to use.

If you're not dealing with memory-mapped files, then you don't really need to bother with ByteBuffer; you'd normally use arrays of byte. If you're trying to build a web server with the fastest possible throughput of file-based raw byte data, then ByteBuffer (specifically MappedByteBuffer) is your best friend.
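A minimal sketch of reading a file through a MappedByteBuffer, assuming a small temporary file created just for the demonstration (the class name, method names and file content are hypothetical). It only shows the mapping API; it says nothing about actual throughput.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileDemo {
    // Maps the whole file read-only and sums its bytes as unsigned values.
    static long sumBytes(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (map.hasRemaining()) {
                sum += map.get() & 0xFF;   // reads come from the mapped region, not a Java-side copy
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes four bytes to a temp file and sums them via the mapping.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("mapped-demo", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {1, 2, 3, (byte) 250});
            return sumBytes(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

For pure file-to-network copying, `FileChannel.transferTo` is the related call worth looking at, since it can hand the copy to the operating system.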

JRalph
  • 2
    It is not the Java security model that is the limitation. It is the JVM architecture that prevents you from casting bytes to typed data. – Stephen C Mar 06 '11 at 14:25
  • The security model also affects the usability of ByteBuffer, at least in my testing, which is a few years old now. Every time you call one of the cast functions in the ByteBuffer class, SecurityManager code gets executed, which slows the whole process down. This is why the regular Java IO functions are generally faster for reading in Java basic types. This contrasts with C, where memory-mapped files with a cast are much, much faster than using stdio. – JRalph Mar 06 '11 at 14:39
  • 2
    Looking at the code, the security manager calls only appear to occur in DirectByteBuffer case. I think it happens because the method is using `Unsafe`. – Stephen C Mar 06 '11 at 15:34
3

These two articles may help you: http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly and http://evanjones.ca/software/java-bytebuffers.html

bluefoot
  • 1
    I cannot reproduce the first link's conclusion that FileChannel is significantly faster than FileInputStream for reading into byte[]. I suspect that, since they use a 100MB file, they actually benchmark reading from the operating system's disk cache rather than from the hard drive itself. That would explain why their tests imply a bandwidth of 250MB/s, which is pretty damn fast for a disk. In my tests with a 1.5GB file, both methods achieve a throughput of 40MB/s, indicating that the disk is the bottleneck, not the CPU. Of course, mileage with a solid state disk might differ. – meriton Mar 06 '11 at 15:01
  • 7
    You could improve the quality of this answer by letting us know why these links might be helpful. Link-only answers are not ideal. – james.garriss Aug 11 '15 at 14:30