15

I am trying to "clean up" a ByteBuffer so that every byte is zero (0x00). I tried looping over all positions in the buffer and setting each one to 0x00, but that is too slow. Is there a faster way to zero out a ByteBuffer, similar to what BitSet.clear() does?

Please note that ByteBuffer.clear() is not an appropriate solution for me in this scenario--I have to erase all the data inside of the buffer, and not just reset the pointer to the beginning.

Any hints?

Edit: the ByteBuffer is used as part of a hash table, and it holds the references to the hash table entries. Every time the hash table is flushed, I have to reset those entries so that later insertions start from a clean state. Since the hash table is accessed at random positions, I cannot simply clear() the state of the byte buffer.

Brad Mace
asksw0rder
  • Can you explain the use case in some more detail? What do you get the bytebuffer from? – jontro Jun 25 '12 at 22:01
  • Why do you think you need to zero out the buffer? – user207421 Jun 25 '12 at 22:04
  • Is it a direct buffer? If not, what about just `ByteBuffer.wrap(new byte[123456]);` – Greg Kopff Jun 25 '12 at 22:20
  • @GregKopff I want to eliminate the object creation as much as possible, so I will prefer reusing the ByteBuffer instead of wrapping a new one. – asksw0rder Jun 25 '12 at 22:53
  • @jontro more information about the ByteBuffer is added~ – asksw0rder Jun 25 '12 at 22:54
  • @asksw0rder That doesn't answer the question. There is no need to zero a ByteBuffer. It carries its own position and limit, so provided you program correctly you will only ever use what has been put into it. Just clear it. – user207421 Jun 25 '12 at 22:55
  • @EJP the important part of the hash table is that the reference is indexed by its position. So assume that we are inserting a key, the position of this key will be calculated using the hash function, meaning that the position is not continuously located over the buffer. If the content is not erased when flushing, we cannot tell whether an entry is inserted before the flushing or it is newly inserted. Hope my explanation is clear to you. – asksw0rder Jun 25 '12 at 22:58

4 Answers

10

Have you tried using one of the ByteBuffer.put(byte[]) or ByteBuffer.put(ByteBuffer) methods to write multiple zeros in one go? You could then iterate over the buffer in chunks of 100 or 1000 bytes, or whatever, using an array or buffer pre-filled with zeros.

Downside: put is an optional operation, so not every ByteBuffer is required to support it. A read-only buffer, for example, throws ReadOnlyBufferException.
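A minimal sketch of the chunked bulk put described above, assuming a writable buffer. The class name `BulkZero`, the method name `zero`, and the 4096-byte chunk size are illustrative choices, not part of the original answer.

```java
import java.nio.ByteBuffer;

final class BulkZero {
    // Shared block of zeros; a new byte[] is zero-initialized by the JVM.
    // The 4096-byte chunk size is an arbitrary choice for illustration.
    private static final byte[] ZEROS = new byte[4096];

    /** Overwrites every byte from index 0 to capacity with 0x00 and leaves the buffer cleared. */
    static void zero(ByteBuffer buf) {
        buf.clear(); // position = 0, limit = capacity; contents are untouched
        while (buf.hasRemaining()) {
            int n = Math.min(buf.remaining(), ZEROS.length);
            buf.put(ZEROS, 0, n); // bulk relative put, advances position by n
        }
        buf.clear(); // reset position and limit for the next round of inserts
    }
}
```

Because the zero block is allocated once and reused, flushing creates no new objects. A larger chunk means fewer put calls, so the size is worth measuring for your buffer.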

DNA
  • Will give it a try. Hopefully a bulk put will be better than the loop... thx! – asksw0rder Jun 25 '12 at 22:55
  • Sorry for the late reply, but this approach really does reduce the flushing overhead. I have seen the flushing time drop from ~60ms to ~2ms. Will see whether it is good enough. – asksw0rder Jun 27 '12 at 23:04
4

For ByteBuffer implementations that provide the optional array() method (where hasArray() returns true), you could use this method to get a reference to the underlying array, then use java.util.Arrays#fill(). If the buffer is a slice, remember to add arrayOffset() to the indices.
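A minimal sketch of the backing-array approach. The class name `ArrayZero` and method name `zeroIfArrayBacked` are hypothetical; the caller is assumed to fall back to another method when this returns false.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class ArrayZero {
    /**
     * Zeroes every byte the buffer can address if it exposes a backing array.
     * Returns false and changes nothing otherwise (read-only buffers and,
     * on typical JVMs, direct buffers report hasArray() == false).
     */
    static boolean zeroIfArrayBacked(ByteBuffer buf) {
        if (!buf.hasArray()) {
            return false;
        }
        // A sliced buffer may start partway into the shared array.
        int start = buf.arrayOffset();
        Arrays.fill(buf.array(), start, start + buf.capacity(), (byte) 0);
        return true;
    }
}
```

This version ignores position and limit on purpose, since the question asks for the whole buffer to be erased. Only the region between arrayOffset() and arrayOffset() + capacity() is touched, so other slices of the same array are left alone.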

Jim Garrison
2

If you need a fresh, zero-filled ByteBuffer after the hash table is flushed, the easiest way is to drop the existing ByteBuffer and allocate a new one. The official documentation did not originally say so, but all known implementations zero the memory of new buffers, and newer Javadoc for allocate() and allocateDirect() states that each element is initialized to zero. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6535542 for additional info.
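A minimal sketch of the replace-instead-of-clear approach, assuming the extra allocation on each flush is acceptable (the asker says above that they would rather avoid it). The class name `FreshBuffer` and method name `replace` are hypothetical.

```java
import java.nio.ByteBuffer;

final class FreshBuffer {
    /**
     * Returns a newly allocated buffer with the same capacity, kind
     * (direct or heap), and byte order as the old one. Its contents are
     * zero because newly allocated buffers are zero-filled.
     */
    static ByteBuffer replace(ByteBuffer old) {
        ByteBuffer fresh = old.isDirect()
                ? ByteBuffer.allocateDirect(old.capacity())
                : ByteBuffer.allocate(old.capacity());
        // New buffers default to big-endian, so copy the order explicitly.
        return fresh.order(old.order());
    }
}
```

Direct buffers are comparatively expensive to allocate and release, so for large direct buffers that are flushed often, zeroing in place may still be the better trade-off.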

Nathan
Alex Cohn
2

As DNA mentions, having a pre-filled buffer and using ByteBuffer.put(ByteBuffer) is probably the fastest portable way. If that's not practical, you can do something like this to take advantage of either Arrays.fill or Unsafe.putLong when applicable:

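import java.nio.ByteBuffer;
import java.util.Arrays;

// Fills buf from its position to its limit with b; position ends at limit.
public static void fill(ByteBuffer buf, byte b) {
    if (buf.hasArray()) {
        // Array-backed buffer: fill the backing array directly.
        final int offset = buf.arrayOffset();
        Arrays.fill(buf.array(), offset + buf.position(), offset + buf.limit(), b);
        buf.position(buf.limit());
    } else {
        int remaining = buf.remaining();
        if (UNALIGNED_ACCESS) {
            // Replicate the byte into all 8 bytes of a long. Masking with
            // 0xFFL avoids sign extension when b is negative.
            final long l = (b & 0xFFL) * 0x0101010101010101L;
            while (remaining >= 8) {
                buf.putLong(l);
                remaining -= 8;
            }
        }
        // Write whatever is left (or everything) one byte at a time.
        while (remaining-- > 0) {
            buf.put(b);
        }
    }
}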

Setting UNALIGNED_ACCESS requires some knowledge of your JRE implementation and platform. Here's how I would set it for the Oracle JRE when also using JNA (which provides Platform.ARCH as a convenient, canonical way to access the os.arch system property).

/**
 * Indicates whether the ByteBuffer implementation likely supports unaligned
 * access of multi-byte values on the current platform.
 */
private static final boolean UNALIGNED_ACCESS = Platform.ARCH.startsWith("x86");
Trevor Robinson