I have an application that works with a large amount of data (100 GB+) stored in ESENT. The table schema is: 12-byte fixed-size keys (the column is created with JET_bitColumnFixed) and JET_coltypLongBinary values with a typical size of around 2 KiB. The page size is set to 32 KiB. I haven't changed the default 1024-byte size threshold for external long values, so I believe most of these values are stored externally.
I want to improve cold-cache seek and retrieve performance, because the operations happen in batches and the keys are known in advance. As far as I understand, the JetPrereadKeys() API is designed for exactly this case, but I see no change in the actual behavior with or without the call.
More details follow:
In my case, JetPrereadKeys() always reports the expected number of pre-read keys, equal to the number of keys I submitted in the call. The submitted keys are sorted, as the documentation requires.
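A minimal sketch of how such a batch could be prepared, for reference. It assumes every key is a 12-byte string that sorts in plain bytewise (memcmp) order, and that forward pre-read wants ascending order. The helper names build_preread_batch and batch_is_ascending are hypothetical. The ESENT call itself appears only in a comment, because esent.h is Windows-only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEY_SIZE 12 /* key length from the schema above; assumed identical for all keys */

/* qsort comparator: ascending bytewise order over fixed-size keys. */
static int compare_keys(const void *a, const void *b)
{
    return memcmp(a, b, KEY_SIZE);
}

/*
 * Hypothetical helper: sorts `keys` in place, then fills the parallel
 * pointer and length arrays in the shape JetPrereadKeys() takes
 * (its rgpvKeys and rgcbKeys arguments).
 */
static void build_preread_batch(unsigned char (*keys)[KEY_SIZE], long count,
                                const void **key_ptrs, unsigned long *key_sizes)
{
    assert(count >= 0);
    qsort(keys, (size_t)count, KEY_SIZE, compare_keys);
    for (long i = 0; i < count; i++) {
        key_ptrs[i] = keys[i];
        key_sizes[i] = KEY_SIZE;
    }
}

/* Hypothetical sanity check: returns 1 if the batch is in ascending order. */
static int batch_is_ascending(const void **key_ptrs,
                              const unsigned long *key_sizes, long count)
{
    for (long i = 1; i < count; i++) {
        unsigned long a = key_sizes[i - 1];
        unsigned long b = key_sizes[i];
        int c = memcmp(key_ptrs[i - 1], key_ptrs[i], a < b ? a : b);
        if (c > 0 || (c == 0 && a > b))
            return 0;
    }
    return 1;
}

/*
 * On Windows the arrays would then be passed to ESENT, roughly as:
 *
 *   long preread = 0;
 *   JetPrereadKeys(sesid, tableid, key_ptrs, key_sizes, count,
 *                  &preread, JET_bitPrereadForward);
 *
 * Not compiled here; sesid and tableid are assumed to be an open
 * session and table cursor.
 */
```

If the API expects normalized keys (as produced by JetMakeKey() and read back with JetRetrieveKey()) rather than raw column bytes, the same sort would have to be applied to those normalized keys instead.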
I tried both synchronous and asynchronous approaches. In the asynchronous one, I dispatch the pre-read call to a thread pool while continuing to seek and retrieve data on the current thread.
I tried both available caching modes of ESENT, memory-mapped views and the dedicated page cache, by testing all combinations of the JET_paramEnableViewCache and JET_paramEnableFileCache parameters.
With one small exception, I see no difference in the logged I/O operations with and without the pre-read. I would expect the call to trigger a (preferably asynchronous) fetch of the necessary internal nodes of the B-tree. Instead, the only thing I see is an occasional small synchronous read issued from the call stack of JetPrereadKeys() itself. The read is small enough that I don't think it could possibly prefetch all the required information.
If I debug the Windows Search service, I can break on various calls to JetPrereadKeys(). So there is at least one real-world example where this API is being called, presumably for a reason.
All my experiments were performed after a machine restart, to ensure that the database page cache is empty.
Questions:
What is the expected behavior of JetPrereadKeys() in the described case?
Should I expect to see a different I/O pattern and better performance if I use this API? Should I expect to see a synchronous or an asynchronous pre-read of the data?
Is there another approach I could try to improve I/O performance by hinting ESENT about an upcoming batch?