I ran a small benchmark to compare Couchbase (running on Windows) with Redis and MySQL (EDIT: added Aerospike to the test)

We are inserting 100,000 JSON "documents" into four databases/stores:

  • Redis (plain inserts, nothing else)
  • Couchbase (in-memory Ephemeral bucket, JSON index on JobId)
  • MySQL (simple table: Id (int), Data (MediumText), index on Id)
  • Aerospike (in-memory storage)

The JSON file is 67 lines, about 1800 bytes.

INSERT:

  • Couchbase: 60-100 seconds (EDIT: seems to vary quite a bit!)
  • MySQL: 30 seconds
  • Redis: 8 seconds
  • Aerospike: 71 seconds

READ: We read 1,000 random documents per run, repeat the run 10 times, and look at the averages.

  • Couchbase: 600-700 ms for 1000 GETs (using KeyValue operations, not the Query API; with the Query API, this takes about 1500 ms)
  • MySQL: 90-100 ms for 1000 GETs
  • Redis: 50-60 ms for 1000 GETs
  • Aerospike: 750 ms for 1000 GETs

Conclusion: Couchbase seems slowest (its INSERT times vary a lot), and Aerospike is also very slow. Both use in-memory storage (Couchbase => Ephemeral bucket, Aerospike => storage-engine memory).

Question: Why are the in-memory writes and reads on Couchbase so slow, even slower than plain MySQL on an SSD?

CODE

Note: Using Task.WhenAll, or awaiting each call, doesn't make a difference.

INSERT

Couchbase:

IBucket bucket = await cluster.BucketAsync("halo"); // <-- ephemeral 
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");

// EDIT: Added this to avoid measuring lazy loading:
JObject t = JObject.FromObject(_baseJsonObject);
t["JobId"] = 0;
t["CustomerName"] = $"{firstnames[rand.Next(0, firstnames.Count)]} {lastnames[rand.Next(0, lastnames.Count)]}"; // Random.Next's upper bound is exclusive
await collection.InsertAsync("0", t);
await collection.RemoveAsync("0");

List<Task> insertTasks = new List<Task>();
sw.Start();
foreach (JObject temp in jsonObjects) // jsonObjects is pre-created, so it's not a factor in the test
{
    insertTasks.Add(collection.InsertAsync(temp.GetValue("JobId").ToString(), temp));
}
await Task.WhenAll(insertTasks);
sw.Stop();
Console.WriteLine($"Adding {nbr} to Couchbase took {sw.ElapsedMilliseconds} ms");

Redis (using ServiceStack!)

sw.Restart();
using (var client = redisManager.GetClient())
{
    foreach (JObject temp in jsonObjects)
    {
        client.Set($"jobId:{temp.GetValue("JobId")}", temp.ToString());
    }
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to Redis took {sw.ElapsedMilliseconds} ms");
sw.Reset();
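As Paddy's comment below suggests, a bulk API would be a fairer Redis comparison. ServiceStack's IRedisClient exposes SetAll, which writes all entries in a single pipelined call; a minimal sketch (reusing jsonObjects, sw, redisManager and nbr from the code above):

```csharp
// Build the full key -> value map up front, then write it in one batched call.
// SetAll pipelines the writes, avoiding one network round trip per document.
var entries = new Dictionary<string, string>();
foreach (JObject temp in jsonObjects)
{
    entries[$"jobId:{temp.GetValue("JobId")}"] = temp.ToString();
}

sw.Restart();
using (var client = redisManager.GetClient())
{
    client.SetAll(entries);
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to Redis (SetAll) took {sw.ElapsedMilliseconds} ms");
```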

MySQL:

MySql.Data.MySqlClient.MySqlConnection mySqlConnection = new MySql.Data.MySqlClient.MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
sw.Restart();
foreach (JObject temp in jsonObjects)
{
    using MySql.Data.MySqlClient.MySqlCommand cmd = new MySql.Data.MySqlClient.MySqlCommand("INSERT INTO test (id, data) VALUES (@id, @data)", mySqlConnection);
    cmd.Parameters.AddWithValue("@id", temp.GetValue("JobId").ToString());
    cmd.Parameters.AddWithValue("@data", temp.ToString());
    cmd.ExecuteNonQuery();
}
sw.Stop();
Console.WriteLine($"Adding {nbr} to MySql took {sw.ElapsedMilliseconds} ms");
sw.Reset();

READ

Couchbase:

IBucket bucket = await cluster.BucketAsync("halo");
IScope scope = bucket.Scope("myScope");
var collection = scope.Collection("myCollection");


// Runs inside an outer loop, for (int q = 0; q < 10; q++), which repeats the measurement.
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    string key = $"{r.Next(1, 100000)}"; // r is a shared Random
    var result = await collection.GetAsync(key);
}
sw.Stop();
Console.WriteLine($"Couchbase Q: {q}\t{sw.ElapsedMilliseconds}");

Redis:

// Runs inside the same outer loop, for (int q = 0; q < 10; q++).
Stopwatch sw = Stopwatch.StartNew();
using (var client = redisManager.GetClient())
{
    for (int i = 0; i < nbr; i++) // nbr = 1000
    {
        client.Get<string>($"jobId:{r.Next(1, 100000)}");
    }
}
sw.Stop();
Console.WriteLine($"Redis Q: {q}\t{sw.ElapsedMilliseconds}");

MySQL:

MySqlConnection mySqlConnection = new MySqlConnection("Server=localhost;Database=test;port=3306;User Id=root;password=root;");
mySqlConnection.Open();
            
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < nbr; i++)
{
    using MySqlCommand cmd = new MySqlCommand("SELECT data FROM test WHERE Id = @id", mySqlConnection);
    cmd.Parameters.AddWithValue("@id", r.Next(1, 100000));
    using MySqlDataReader rdr = cmd.ExecuteReader();

    while (rdr.Read())
    {
    }
}
sw.Stop();
Console.WriteLine($"MySql Q: {q} \t{sw.ElapsedMilliseconds} ms");
sw.Reset();

Couchbase setup:

(screenshots of the cluster/bucket configuration and the Bucket Durability setting, which is set to "None")

I only have 1 node (no cluster); it's local on my machine, running a Ryzen 3900X (12 cores), M.2 SSD, Win10, 32 GB RAM.

If you made it this far, here is a GitHub repo with my benchmark code: https://github.com/tedekeroth/CouchbaseTests

Ted
  • The Redis and Mysql code would be useful to ensure that a fair comparison is being made. For example, are the REDIS bulk APIs being used? – Paddy Nov 25 '20 at 11:33
  • Nope, both MySQL and Redis are sequential inserts, not even using Task.WhenAll, and still blazing fast. I will update with that code too, as requested. – Ted Nov 25 '20 at 11:39
  • @Ted for another datapoint for comparison, it would be interesting to see the results of running a similar test with the cbc-pillowfight tool (https://docs.couchbase.com/sdk-api/couchbase-c-client-2.4.8/md_doc_cbc-pillowfight.html) that will be in the bin directory of your Couchbase install. – Graham Pople Nov 27 '20 at 10:07
  • Sorry, I didn't see this comment until now; weird, I didn't get a "ping" here. Yeah, that might be interesting, but I have since unfortunately "moved on" and am looking at other solutions. I hope this will help someone else though. – Ted Nov 30 '20 at 19:24
  • @Ted fair enough. Just want anyone stumbling across this to know that the times you're seeing are MUCH slower than we'd expect to see from Couchbase. I'm very confused by your findings elsewhere that performing the inserts in parallel wasn't much faster than in serial. If you ever get a chance to get back to this, would be interesting to see if tweaking the SDK NumKvConnections param helps. This controls the number of connections from the SDK to each Key-Value node, and it defaults to 1. In bulk-loading scenarios like this, it can help throughput. – Graham Pople Dec 16 '20 at 14:45

2 Answers


I took your CouchbaseTests and commented out the non-Couchbase bits. I fixed the query to select from the collection (myCollection) instead of jobcache, removed the Metrics option, and created an index on JobId:

create index mybucket_JobId on default:myBucket.myScope.myCollection (JobId)

With that, it inserts the 100,000 documents in 19 seconds, kv-fetches a document in 146 usec on average, and queries by JobId in 965 usec on average.

Couchbase Q: 0 187
Couchbase Q: 1 176
Couchbase Q: 2 143
Couchbase Q: 3 147
Couchbase Q: 4 140
Couchbase Q: 5 138
Couchbase Q: 6 136
Couchbase Q: 7 139
Couchbase Q: 8 125
Couchbase Q: 9 129
average et: 146 ms per 1000 -> 146 usec / request

Couchbase Q: 0 1155
Couchbase Q: 1 1086
Couchbase Q: 2 1004
Couchbase Q: 3 901
Couchbase Q: 4 920
Couchbase Q: 5 929
Couchbase Q: 6 912
Couchbase Q: 7 911
Couchbase Q: 8 911
Couchbase Q: 9 927
average et: 965 ms per 1000 -> 965 usec / request. (coincidentally exactly the same as with the java api).

This was on 7.0 build 3739 on a Mac Book Pro with the cbserver running locally.

######################################################################

I have a small LoadDriver application for the Java SDK that uses the kv api. With 4 threads, it shows an average response time of 54 microseconds and a throughput of 73,238 requests/second. It uses the travel-sample bucket on a Couchbase server on localhost: git@github.com:mikereiche/loaddriver.git

Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms count: 729873, requests/second: 72987, max: 2796us avg: 54us, aggregate rq/s: 73238

For the query API I get the following, which is about 18 times slower.

Run: seconds: 10, threads: 4, timeout: 40000us, threshold: 8000us requests/second: 0 (max), forced GC interval: 0ms count: 41378, requests/second: 4137, max: 12032us avg: 965us, aggregate rq/s: 4144

Michael Reiche
  • Thanks for testing this. Can you provide a GitHub URL or something like it? I want to test your version of this. Also, as far as I understand, the KeyValue operations shouldn't need an index; the key is always indexed, I thought? In any case, not adding that index should make inserts faster anyway, right? – Ted Dec 08 '20 at 03:46
  • https://github.com/mikereiche/CouchbaseTests Ted - I vaguely recall trying without the index and don't recall the inserts being faster. The indexing is asynchronous. – Michael Reiche Dec 09 '20 at 15:30
  • @Ted that's correct, using the Key-Value API doesn't touch indexes. I think Michael is mentioning the index because he also tested N1QL queries. – Graham Pople Dec 16 '20 at 14:40
  • With a GSI defined, inserts - using the kv-api or otherwise - would cause the new documents to be indexed (asynchronously). Like I said, that did not seem to result in an increase in the insert time - likely because the test inserts serially, and my machine has 6 cpus. – Michael Reiche Dec 17 '20 at 21:07

I would have to run such a comparison myself to do a full investigation, but two things stand out.

  1. Your parallel execution isn't truly parallel. async methods run synchronously up to the first await, so all of the code in InsertAsync/GetAsync before the first await runs sequentially while you are adding your tasks, not in parallel.

  2. CouchbaseNetClient does some lazy connection setup in the background, and you're paying that cost inside the timed section. Depending on the environment (including SSL negotiation and the like), this can add significant initial latency.

You can potentially address the first issue by using Task.Run to kick off each operation, but you may need to pre-size the ThreadPool (e.g. via ThreadPool.SetMinThreads).
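A sketch of that first suggestion (illustrative only; the SetMinThreads values are not tuned, and collection/jsonObjects are the objects from the question's code):

```csharp
// Raise the thread pool floor so Task.Run doesn't stall while the pool
// slowly injects new threads. 100/100 is an illustrative value, not tuned.
ThreadPool.SetMinThreads(100, 100);

var insertTasks = new List<Task>();
foreach (JObject temp in jsonObjects)
{
    JObject doc = temp;
    // Task.Run moves the synchronous prefix of InsertAsync onto the pool,
    // so building the task list no longer serializes that work.
    insertTasks.Add(Task.Run(() => collection.InsertAsync(doc.GetValue("JobId").ToString(), doc)));
}
await Task.WhenAll(insertTasks);
```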

You can address the second issue by performing at least one operation on the bucket (or calling bucket.WaitUntilReadyAsync()) before the timed section.
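For example (a sketch; the 10-second timeout and the "warmup" key are arbitrary choices):

```csharp
// Outside the timed section: force connection setup, authentication and
// (if enabled) TLS negotiation to complete before the stopwatch starts.
IBucket bucket = await cluster.BucketAsync("halo");
await bucket.WaitUntilReadyAsync(TimeSpan.FromSeconds(10));

// One throwaway KV round trip also warms up the data connection.
var collection = bucket.Scope("myScope").Collection("myCollection");
await collection.UpsertAsync("warmup", new { });
await collection.RemoveAsync("warmup");
```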

That said, 60 seconds for the inserts still looks abnormal. How many nodes do you have, and what Durability setting are you using?

Richard P
  • Regarding "parallel execution": Yes, the tasks are created sequentially, but that is just the creation of the tasks, not the work itself. I wrote above that creating tasks and then WhenAll didn't make a difference compared to awaiting each call. I re-ran it again, both using WhenAll and sequentially awaiting each call. The times are the same, but now it takes 100 seconds instead... – Ted Nov 25 '20 at 19:36
  • 2: So, if I do one insert outside the timed section, and then go? Would that remedy the lazy stuff? – Ted Nov 25 '20 at 19:40
  • I updated the question above with the lazy-loading remedy, as well as the Durability setting for the bucket, which is set to "None". – Ted Nov 25 '20 at 19:45
  • I re-ran it with the updated code (lazy-loading remedy). It now took 85 seconds. Also ran Redis INSERTs again, took 8.5 seconds. – Ted Nov 25 '20 at 19:51
  • Update: If I 1) remove the collection, re-add the collection, and add a PRIMARY key; 2) re-run the code to INSERT 100k, I again get 100 seconds. It seems to vary a bit. The computer is not under load; nothing is really running except Win10, a browser, and Visual Studio. Still, Redis takes only 8 seconds... I added a GitHub repo where I uploaded the test code. – Ted Nov 25 '20 at 20:05