
I'm using Chronicle to transfer vast amounts of data from one JVM to another. The problem is that I notice a lot of jitter in my benchmarks. My knowledge of memory-mapped files is somewhat limited, but I do know that the OS swaps pages back and forth between memory and disk.

How do I configure those pages for maximum performance when using Chronicle, meaning in my case less jitter and the lowest possible latency? Do they need to be big or small? Do they need to be many or few?

Here is what I currently have on my Ubuntu box:

$ cat /proc/meminfo | grep Huge
AnonHugePages:      2048 kB
ShmemHugePages:        0 kB
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
  • Don't use memory-mapped files if you need guaranteed latency (mapped shared memory objects are fine, though). There is no way to optimise them. Use file access with O_SYNC and O_DIRECT, lock the process in memory, disable swap, and do all the other things mandatory for real-time latency. A stock JVM is also not an option, though there are some real-time-ready JVM versions. – SK-logic Feb 08 '22 at 08:38
  • `mapped shared memory objects are fine though` => how does that work? I thought that, for obvious security reasons, you would never be allowed to access another process's memory unless you use a memory-mapped file as a bridge. A process can't simply access an arbitrary block of memory without first allocating it for itself. – Roger Kasinsky Feb 08 '22 at 12:57
  • A shared memory region in Unix is a special block device "file" that you can `mmap()`. – SK-logic Feb 08 '22 at 13:46
  • But then how will another process gain access to this same region of memory? It sounds forbidden, no? – Roger Kasinsky Feb 08 '22 at 14:04
  • By using its name, in "/dev/shm/WhateverYouNameYourRegion". – SK-logic Feb 08 '22 at 16:51
  • So there is no page swapping inside `/dev/shm/`? Everything just stays in memory? So provided you have enough RAM, it is a much faster solution than disk-backed memory-mapped files, correct? – Roger Kasinsky Feb 08 '22 at 17:18
  • Yes, correct: mmap is only OK for shared memory regions, provided you have mlock-ed them all and don't have swap. mmap cannot be used for real file I/O, anything that would access slow data storage. If everything is configured correctly, you should see near-zero-latency communication. – SK-logic Feb 08 '22 at 20:07
  • @SK-logic you can get the 99.99% latency well below 10 microseconds with stock JVMs. – Peter Lawrey Feb 09 '22 at 13:02
  • `/dev/shm` is still a memory-mapped file, just not backed by disk. – Peter Lawrey Feb 09 '22 at 13:03
  • @PeterLawrey, not if you do any file I/O. Not if there's any context switching involved (and with a stock JVM it's hardly possible to avoid). You will have an average latency below 10us, but there will be occasional spikes of a few ms. – SK-logic Feb 09 '22 at 14:19
  • https://stackoverflow.com/questions/71059608/best-practices-for-java-ipc-through-dev-shm-with-the-lowest-possible-latency – Roger Kasinsky Feb 10 '22 at 03:54
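The shared-memory mechanism discussed in the comments above can be sketched in plain Java using the standard NIO API. This is a minimal illustration, not Chronicle code; the region name `/dev/shm/demo-region` is made up, and any cooperating process would have to agree on it:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ShmDemo {
    public static void main(String[] args) throws IOException {
        // Hypothetical region name; any process that maps the same path
        // under /dev/shm (a RAM-backed tmpfs on Linux) shares these pages.
        Path region = Paths.get("/dev/shm/demo-region");
        try (FileChannel ch = FileChannel.open(region,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.putInt(0, 42);                  // write is visible to other mappers
            System.out.println(buf.getInt(0));  // prints 42
        } finally {
            Files.deleteIfExists(region);
        }
    }
}
```

Because tmpfs pages live in RAM and (with swap disabled) are never written to a device, reads and writes to such a mapping are ordinary memory accesses, which is why the latency profile is so much flatter than with a disk-backed mapping.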

1 Answer


Assuming you are on Linux, you can enable sparse files with `useSparseFiles(true)` on the builder.
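As a configuration sketch (assuming a recent Chronicle Queue release where `useSparseFiles(boolean)` is available on `SingleChronicleQueueBuilder`; the queue path here is made up), enabling it looks roughly like this:

```java
import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;

// Sketch only: requires the chronicle-queue dependency on the classpath.
try (ChronicleQueue queue = SingleChronicleQueueBuilder
        .binary("/dev/shm/my-queue")   // hypothetical path; see the next paragraph
        .useSparseFiles(true)          // pre-size files without allocating all blocks
        .build()) {
    queue.acquireAppender().writeText("hello");
}
```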

You can also use a faster drive to reduce outliers, or write to `/dev/shm`.

There is an asynchronous mode in the closed-source version; however, you can get most outliers well below 80 microseconds without it.

Chronicle Queue doesn't use huge pages.

Here is a chart I created when I was comparing it to Kafka, writing to a Corsair MP600 Pro XT.

http://blog.vanillajava.blog/2022/01/benchmarking-kafka-vs-chronicle-for.html

NOTE: This is the latency for two hops, writing and reading an object of around 220 bytes (with serialization).


Peter Lawrey
  • Thanks, Peter. Where can I read more about `sparse files`? Is it something offered by the operating system to increase performance? – Roger Kasinsky Feb 09 '22 at 13:37
  • Are those numbers shown in the graph average latencies? – Roger Kasinsky Feb 12 '22 at 20:35
  • @rogerkasinsky Unix supports sparse files, which means only the pages actually written to are allocated on disk or in memory. The average latency falls between the typical (50%) and 90% latencies. Average latencies are a great way to hide occasional poor performance, so a latency distribution is better. – Peter Lawrey Feb 14 '22 at 18:52
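The sparse-file behaviour described in the last comment can be demonstrated with nothing but the JDK: write a single byte one gigabyte into a fresh file, and the logical size jumps past 1 GiB while, on a filesystem with sparse-file support such as ext4, only a few kilobytes of real blocks are allocated. A minimal sketch:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SparseDemo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("sparse", ".dat");
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.WRITE)) {
            ch.position(1L << 30);                     // seek 1 GiB into the file
            ch.write(ByteBuffer.wrap(new byte[]{1}));  // one real byte at the end
        }
        // Logical size is just over 1 GiB...
        System.out.println("logical size: " + Files.size(p));
        // ...but `du -h` on this file would show only a few KiB actually
        // allocated, because the untouched range is a "hole" the
        // filesystem never backs with real blocks.
        Files.delete(p);
    }
}
```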