I have a program that uses RocksDB to write a huge number of key-value pairs to a database:
#include <cassert>
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

using namespace rocksdb;

// The original snippet did not show the path; placeholder value.
const std::string kDBPath = "/tmp/rocksdb_example";

int main() {
  DB* db;
  Options options;
  // Optimize RocksDB. This is the easiest way to get RocksDB to perform well.
  options.IncreaseParallelism(12);
  options.OptimizeLevelStyleCompaction();
  // Create the DB if it's not already present.
  options.create_if_missing = true;
  // Open the DB.
  Status s = DB::Open(options, kDBPath, &db);
  assert(s.ok());
  for (int i = 0; i < 1000000; i++) {
    // Put a key-value pair; the keys are identical on every run.
    s = db->Put(WriteOptions(), "key" + std::to_string(i),
                "a hard-coded string here");
    assert(s.ok());
  }
  delete db;
  return 0;
}
When I run the program for the first time, it generates about 2 GB of database files. Running the same program several more times, without any changes, I ended up with roughly N * 2 GB of data, where N is the number of runs. Only after a certain number of runs did the database size start to shrink.

What I expected is that, since the batch of data does not change between runs, each run should simply overwrite the previous one, so the database size should stay at ~2 GB.
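From what I understand of RocksDB's LSM-tree design (this is my reading of the docs, not something confirmed by my experiment), a Put on an existing key writes a new entry into a new SST file, and the stale version is only dropped when compaction rewrites the files containing it. One thing I considered trying is forcing a full compaction after the writes; a minimal sketch, assuming a `db` handle opened as above:

```cpp
#include <cassert>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

// Sketch: compact the entire key range so that stale versions of
// overwritten keys are rewritten away and disk space is reclaimed.
void CompactWholeDB(rocksdb::DB* db) {
  rocksdb::CompactRangeOptions cro;
  // Passing nullptr for begin and end means "the entire key range".
  rocksdb::Status s = db->CompactRange(cro, nullptr, nullptr);
  assert(s.ok());
}
```

I have not verified whether calling this after each run actually keeps the size near 2 GB, which is part of why I am asking.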
QUESTION: Is this an issue in RocksDB, and if not, what are the proper settings to keep the database size stable when the same key-value pairs are written repeatedly?