Long-time reader, first-time poster.

I'm having a bit of trouble reading data quickly from a set of binary files. ByteBuffers and MappedByteBuffers offer the performance I require, but they seem to need an initial run to warm up. I'm not sure that makes sense, so here's some code:

int BUFFERSIZE = 864;
int DATASIZE = 33663168;

int pos = 0;
// Open File channel to get data
FileChannel channel = new RandomAccessFile(new File(myFile), "r").getChannel();

// Set MappedByteBuffer to read DATASIZE bytes from channel
MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_ONLY, pos, DATASIZE);

// Set Endianness
mbb.order(ByteOrder.nativeOrder());

ArrayList<Double> ndt = new ArrayList<Double>();

// Read doubles from MappedByteBuffer, perform conversion and add to arraylist
while (pos < DATASIZE) {
    double xf = mbb.getDouble(pos);
    ndt.add(xf * cnst * 1000d + stt);
    pos += BUFFERSIZE;
}

// Return arraylist
return ndt;

So this takes about 7 seconds to run, but if I then run it again it takes 10 ms. It seems to need some sort of initial run to set up the correct behaviour. I've found that doing something simple like this first works:

channel = new RandomAccessFile(new File(mdfFile), "r").getChannel();
ByteBuffer buf = ByteBuffer.allocateDirect(DATASIZE);
channel.read(buf);
channel.close();

This takes around 2 seconds and if I then run through the MappedByteBuffer procedure it returns the data in 10ms. I just cannot figure out how to get rid of that initialisation step and read the data in 10ms first time. I've read all sorts of things about 'warming up', JIT and the JVM but all to no avail.

So, my question is, is it possible to get the 10 ms performance straight away or do I need to do some sort of initialisation? If so, what is the fastest way to do this please?

The code is intended to run through around a thousand quite large files so speed is quite important.

Many thanks.

jonesds
  • Hey, you _do_ need to read from the file the first time... – fge Mar 31 '14 at 09:09
  • Wow, that was fast! Thank you. Any ideas on the fastest way to do the read for the first time? – jonesds Mar 31 '14 at 09:21
  • Uh, I don't know whether that is really an answer to your problem, but you can use the `.load()` method of `MappedByteBuffer` to load the mapping into memory; if you have several files to open and can call `.load()` in the background, you can gain time – fge Mar 31 '14 at 09:23
  • Also, as to raw speed, this is really OS dependent; I don't know how Windows does that, but under Linux this will certainly be a call to `mmap()` – fge Mar 31 '14 at 09:25
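
The background `.load()` idea from the comments could look roughly like the sketch below. It is only an illustration under assumptions: the class `PrefetchingReader` and its methods are hypothetical names, the stride-based read mirrors the loop in the question without the `cnst`/`stt` conversion, and `load()` is a best-effort request, so how much time this saves depends on the OS and the disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, not part of the original post.
final class PrefetchingReader {

    // Maps the whole file read-only and asks the OS to page it into memory.
    static MappedByteBuffer mapAndLoad(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            mbb.order(ByteOrder.nativeOrder());
            mbb.load(); // best effort: the disk read happens here, not in the read loop
            return mbb; // the mapping remains valid after the channel is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one double every `stride` bytes, as the loop in the question does.
    static List<Double> readStrided(MappedByteBuffer mbb, int stride) {
        List<Double> values = new ArrayList<>();
        for (int pos = 0; pos + Double.BYTES <= mbb.limit(); pos += stride) {
            values.add(mbb.getDouble(pos));
        }
        return values;
    }

    // Processes the files in order while the next file is loaded on a background thread.
    static List<List<Double>> readAll(List<Path> files, int stride) throws Exception {
        ExecutorService loader = Executors.newSingleThreadExecutor();
        try {
            List<List<Double>> results = new ArrayList<>();
            Future<MappedByteBuffer> next =
                    files.isEmpty() ? null : loader.submit(() -> mapAndLoad(files.get(0)));
            for (int i = 0; i < files.size(); i++) {
                MappedByteBuffer current = next.get(); // wait until this file is loaded
                if (i + 1 < files.size()) {
                    Path upcoming = files.get(i + 1);
                    next = loader.submit(() -> mapAndLoad(upcoming)); // start the next load
                }
                results.add(readStrided(current, stride));
            }
            return results;
        } finally {
            loader.shutdown();
        }
    }
}
```

With the values from the question this would be called as `PrefetchingReader.readAll(files, 864)`. The disk read still has to happen for every file; the sketch only overlaps it with the processing of the previous file.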

1 Answer

> I just cannot figure out how to get rid of that initialisation step and read the data in 10ms first time

You can't. The data does have to be read from the disk. That takes longer than 10ms. The 10ms is for all the other times when it's already in memory.
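
To see the difference for yourself, a rough sketch like the following times two passes over the same mapped file. The class name `ColdVersusWarm` and the 4096-byte page size are assumptions for illustration, and the first pass is only "cold" if the OS has not already cached the file (for example, right after a reboot or for a file that has not been read recently).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
final class ColdVersusWarm {

    // Reads one byte per page so that every page of the mapping is touched.
    // Files larger than 2 GB cannot be mapped in a single call and are not handled here.
    static long touchPages(Path file, int pageSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            for (int pos = 0; pos < mbb.limit(); pos += pageSize) {
                sum += mbb.get(pos);
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        for (int pass = 1; pass <= 2; pass++) {
            long start = System.nanoTime();
            long sum = touchPages(file, 4096); // 4096 is an assumed page size
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("pass " + pass + ": " + millis + " ms (checksum " + sum + ")");
        }
    }
}
```

If the file was not in the cache beforehand, the first pass pays for the disk read and the second is served from memory, which matches the 7 seconds versus 10 ms described in the question.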

user207421