We are running a simple multi-threaded Java application that uses Berkeley DB databases for its storage. There are about 500 threads, and each thread has its own Berkeley DB database holding about 100K key-value pairs. All databases are transactional, and each transaction contains at most about 1000 operations. There are no long-running transactions.
The problem is that, occasionally, Berkeley DB recovery takes a very long time when we restart our application. During recovery (opening the environment) we see the Java process reading from disk at a rate of ~100MB/s. There are no writes, only reads.
Our setup is like this:
je.env.runCheckpointer=true
je.env.runCleaner=true
je.checkpointer.highPriority=true
je.cleaner.threads=256
je.cleaner.maxBatchFiles=10
je.log.checksumRead=false
je.lock.nLockTables=353
je.maxMemory=16106127360
je.log.nDataDirectories=256
We also tried running a checkpoint manually every 15 minutes (in case the checkpointer stalls or stops), and we also called setMinimizeRecoveryTime(true). Neither helped.
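A minimal sketch of what a periodic manual checkpoint like the one described above can look like. The class name PeriodicCheckpointer is hypothetical and exists only for illustration. The checkpoint itself is passed in as a plain Runnable so the sketch stays independent of the JE jar. The JE calls shown in the comment (CheckpointConfig.setForce, CheckpointConfig.setMinimizeRecoveryTime, Environment.checkpoint) come from the com.sleepycat.je API and assume an already opened Environment named env.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: runs a checkpoint action at a fixed interval on one daemon thread. */
final class PeriodicCheckpointer implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "manual-checkpointer");
            t.setDaemon(true); // do not keep the JVM alive just for checkpoints
            return t;
        });
    private final AtomicLong completed = new AtomicLong();

    // With JE on the classpath, checkpointAction would be something like:
    //   CheckpointConfig cfg = new CheckpointConfig();
    //   cfg.setForce(true);                 // checkpoint even if little was written
    //   cfg.setMinimizeRecoveryTime(true);
    //   Runnable checkpointAction = () -> env.checkpoint(cfg);
    PeriodicCheckpointer(Runnable checkpointAction, long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                checkpointAction.run();
                completed.incrementAndGet();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("checkpoint failed: " + e);
            }
        }, interval, interval, unit);
    }

    /** Number of checkpoint runs that finished without throwing. */
    long completedCheckpoints() {
        return completed.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

For a 15-minute interval this would be created as new PeriodicCheckpointer(action, 15, TimeUnit.MINUTES) after the environment is opened, and closed before the environment is closed.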
We suspect the problem lies in our Java or Berkeley DB configuration.
Is there a way to ensure a faster recovery time, even at the cost of slower puts into the database?