There are a few primary options for dealing with large files in Java.
RandomAccessFile
Pro:
- Simple, though old-school, API
Con:
- The API is call heavy (both Java calls and syscalls) and suboptimal from a performance point of view.
This API may still be a good fit for simple read-only or write-only cases.
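As a rough illustration of the simple case, here is a minimal sketch of seek-based access with RandomAccessFile. The class name, method name, and parameters are made up for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class RandomAccessFileSketch {

    // Writes a long at the given byte offset, then reads it back.
    // Hypothetical helper, only meant to show the seek/read/write style of the API.
    static long writeThenRead(String path, long offset, long value) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.seek(offset);      // move the file pointer to the offset
            file.writeLong(value);  // writes 8 bytes, high byte first
            file.seek(offset);      // go back to where the value starts
            return file.readLong();
        }
    }
}
```

Every operation goes through the shared file pointer, which is what makes this style easy to follow but chatty when many small values are read or written.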
FileChannel
Compared to RandomAccessFile, this class provides a buffer-oriented API, which may be an advantage for both performance and code organization.
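To make the buffer-oriented style concrete, here is a minimal sketch of a positional read through FileChannel into a ByteBuffer. The class and method names are invented for the example, and a heap buffer is assumed; a direct buffer would work the same way.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileChannelSketch {

    // Reads up to `length` bytes starting at `position` into a fresh buffer.
    // Hypothetical helper for illustration only.
    static ByteBuffer readAt(Path path, long position, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long offset = position;
            // A single read may return fewer bytes than requested, so loop.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, offset);
                if (n < 0) {
                    break; // end of file reached
                }
                offset += n;
            }
            buffer.flip(); // switch the buffer from filling to draining
            return buffer;
        }
    }
}
```

One call moves a whole block of bytes, so the number of Java calls and syscalls depends on the buffer size rather than on the number of values read.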
FileChannel itself is always blocking (it is not a selectable channel), but its sibling AsynchronousFileChannel offers asynchronous IO, which is important for maxing out your disk I/O.
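FileChannel has no non-blocking mode of its own, so the sketch below assumes the separate AsynchronousFileChannel class is an acceptable way to keep several reads in flight at once. The class name, method name, and parameters are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

class AsyncReadSketch {

    // Issues two reads without waiting in between, then collects both byte counts.
    // Hypothetical helper for illustration only.
    static int[] readTwoRegions(Path path, long firstPos, long secondPos, int length)
            throws Exception {
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer first = ByteBuffer.allocate(length);
            ByteBuffer second = ByteBuffer.allocate(length);
            // Both requests are submitted before either result is awaited.
            Future<Integer> firstRead = channel.read(first, firstPos);
            Future<Integer> secondRead = channel.read(second, secondPos);
            return new int[] { firstRead.get(), secondRead.get() };
        }
    }
}
```

Whether this actually improves throughput depends on the platform and the storage device, so it is worth measuring before committing to it.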
Additionally, FileChannel offers utilities (transferTo and transferFrom) for zero-copy data transfer: file to file, socket to file, and file to socket.
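A file-to-file transfer could look roughly like the sketch below, which uses FileChannel.transferTo. The class and method names are made up, and whether the copy is truly zero-copy depends on the operating system.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferSketch {

    // Copies `source` into `target` and returns the number of bytes moved.
    // Hypothetical helper for illustration only.
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long transferred = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (transferred < size) {
                long n = in.transferTo(transferred, size - transferred, out);
                if (n <= 0) {
                    break; // nothing more could be transferred
                }
                transferred += n;
            }
            return transferred;
        }
    }
}
```

The same call accepts any WritableByteChannel as the target, so a SocketChannel can be used for the file-to-socket case.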
Memory mapped buffers
Memory mapped buffers are another IO option available via FileChannel.
In theory, memory mapping has the minimal possible overhead for disk access, though in practice its performance is on par with FileChannel.
Memory mapped buffers bring a number of problems, though:
- A memory mapped ByteBuffer can be released only by GC, so the underlying file will remain open for an unpredictable time (especially painful on Windows).
- Memory mapped operations cause page faults, which interfere with JVM system threads. As a consequence, an application using memory mapped IO may experience frequent STW (stop-the-world) stalls.
- Each memory mapped buffer is limited to 2 GiB, so larger files require managing multiple buffers. These buffers MUST be reused, as they cannot be explicitly closed.
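The 2 GiB limit mentioned above can be worked around by mapping a file as a series of chunks. The following is a minimal read-only sketch; the class name, the chunk size parameter, and the lookup method are assumptions made for this example, and reads that span a chunk boundary are not handled.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedMappingSketch {

    private final MappedByteBuffer[] chunks;
    private final long chunkSize;

    // Maps the whole file as consecutive read-only chunks of at most `chunkSize` bytes.
    ChunkedMappingSketch(Path path, long chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize must be in 1..Integer.MAX_VALUE");
        }
        this.chunkSize = chunkSize;
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            int count = (int) ((size + chunkSize - 1) / chunkSize); // round up
            chunks = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long start = i * chunkSize;
                long length = Math.min(chunkSize, size - start);
                chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
            }
        }
        // The mappings stay valid after the channel is closed; they are
        // released only once the buffers themselves are garbage collected.
    }

    // Returns the byte at an absolute file position.
    byte get(long position) {
        int chunk = (int) (position / chunkSize);
        int offset = (int) (position % chunkSize);
        return chunks[chunk].get(offset);
    }
}
```

Keeping the buffers in one long-lived object like this is one way to follow the advice to reuse mappings rather than create them repeatedly.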
Memory mapped buffers may be a good choice if you are working with DB-like data structures, have a hard-to-predict access pattern, and want to rely on OS caching instead of your own buffer management.
Still, the limitations mentioned above and the lack of performance benefits make memory mapped IO a very niche solution in the Java world.