I'm testing Cassandra 4.0 with 3 nodes as a POC. Each node is a VM with 8GB RAM and 2 cores, and the VMs were created so that they do not share I/O.

I started the first node and, with 50 threads in the client, it took 7 sec to insert 150,000 records (no batching), so the write speed was about 22k/sec. Then I added a second node and started a second client with 50 threads, writing to a different table at the same time as the first client and also inserting 150k records. It took 18 sec for both clients to finish, so the write speed dropped to 16k/sec. Finally, I added a third node; with the same 2 clients it took 27 sec to insert 300k records, so the write speed dropped to 11k/sec. Apparently, the write speed decreases as more nodes are added.
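
For reference, the quoted rates are just total records divided by elapsed seconds. A minimal sketch of that arithmetic follows; the helper name `throughput` is only illustrative, and the elapsed times are the rounded values quoted above, so the first figure comes out slightly below the 22k/sec mentioned.

```go
package main

import "fmt"

// throughput returns whole records written per second.
// Hypothetical helper, used only to reproduce the arithmetic above.
func throughput(records, seconds int) int {
	return records / seconds
}

func main() {
	fmt.Println("1 node, 1 client: ", throughput(150000, 7))  // 150k records in 7 s
	fmt.Println("2 nodes, 2 clients:", throughput(300000, 18)) // 2 x 150k records in 18 s
	fmt.Println("3 nodes, 2 clients:", throughput(300000, 27)) // 2 x 150k records in 27 s
}
```

Note that the second and third runs write twice as many records as the first, so the comparison is on combined throughput rather than on elapsed time alone.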

I checked CPU usage and it is around 70-80%.

Here is the result from "nodetool status":

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns  Host ID                               Rack 
UN  10.30.1.1  1.65 GiB  16      ?     4d379ca0-362b-4077-b650-c589088e86ed  rack1
UN  10.30.1.2  3.1 GiB   16      ?     b0d37f83-dfaf-45ae-9749-25f2d6746d0e  rack1
UN  10.30.1.3  2.7 GiB   16      ?     8a48959b-a2a4-4543-abbf-257ddb7ca5b1  rack1

And result from "nodetool tpstats":

Pool Name                    Active Pending Completed Blocked All time blocked
RequestResponseStage         0      0       1615905   0       0               
MutationStage                0      0       2090208   0       0               
ReadStage                    0      0       1466      0       0               
CompactionExecutor           0      0       1239      0       0               
MemtableReclaimMemory        0      0       6         0       0               
PendingRangeCalculator       0      0       4         0       0               
GossipStage                  0      0       7695      0       0               
SecondaryIndexManagement     0      0       0         0       0               
HintsDispatcher              0      0       0         0       0               
MemtablePostFlush            0      0       11        0       0               
PerDiskMemtableFlushWriter_0 0      0       6         0       0               
ValidationExecutor           0      0       0         0       0               
Sampler                      0      0       0         0       0               
ViewBuildExecutor            0      0       0         0       0               
MemtableFlushWriter          0      0       6         0       0               
CacheCleanupExecutor         0      0       0         0       0               
Native-Transport-Requests    0      0       2202576   0       0               

Latencies waiting in queue (micros) per dropped message types
Message type           Dropped     50%               95%       99%                Max               
READ_RSP               0           0.0               0.0       0.0                0.0               
RANGE_REQ              0           0.0               0.0       0.0                0.0               
PING_REQ               0           0.0               0.0       0.0                0.0               
_SAMPLE                0           0.0               0.0       0.0                0.0               
VALIDATION_RSP         0           0.0               0.0       0.0                0.0               
SCHEMA_PULL_RSP        0           0.0               0.0       0.0                0.0               
SYNC_RSP               0           0.0               0.0       0.0                0.0               
SCHEMA_VERSION_REQ     0           0.0               0.0       0.0                0.0               
HINT_RSP               0           0.0               0.0       0.0                0.0               
BATCH_REMOVE_RSP       0           0.0               0.0       0.0                0.0               
PAXOS_COMMIT_REQ       0           0.0               0.0       0.0                0.0               
SNAPSHOT_RSP           0           0.0               0.0       0.0                0.0               
COUNTER_MUTATION_REQ   0           0.0               0.0       0.0                0.0               
GOSSIP_DIGEST_SYN      0           943.1270000000001 1955.666  2816.159           2816.159          
PAXOS_PREPARE_REQ      0           0.0               0.0       0.0                0.0               
PREPARE_MSG            0           0.0               0.0       0.0                0.0               
PAXOS_COMMIT_RSP       0           0.0               0.0       0.0                0.0               
HINT_REQ               0           0.0               0.0       0.0                0.0               
BATCH_REMOVE_REQ       0           0.0               0.0       0.0                0.0               
STATUS_RSP             0           0.0               0.0       0.0                0.0               
READ_REPAIR_RSP        0           0.0               0.0       0.0                0.0               
GOSSIP_DIGEST_ACK2     0           654.9490000000001 3379.391  4055.2690000000002 4055.2690000000002
CLEANUP_MSG            0           0.0               0.0       0.0                0.0               
REQUEST_RSP            0           0.0               0.0       0.0                0.0               
TRUNCATE_RSP           0           0.0               0.0       0.0                0.0               
UNUSED_CUSTOM_VERB     0           0.0               0.0       0.0                0.0               
REPLICATION_DONE_RSP   0           0.0               0.0       0.0                0.0               
SNAPSHOT_REQ           0           0.0               0.0       0.0                0.0               
ECHO_REQ               0           0.0               0.0       0.0                0.0               
PREPARE_CONSISTENT_REQ 0           0.0               0.0       0.0                0.0               
FAILURE_RSP            0           0.0               0.0       0.0                0.0               
BATCH_STORE_RSP        0           0.0               0.0       0.0                0.0               
SCHEMA_PUSH_RSP        0           0.0               0.0       0.0                0.0               
MUTATION_RSP           0           2816.159          10090.808 17436.917          89970.66          
FINALIZE_PROPOSE_MSG   0           0.0               0.0       0.0                0.0               
ECHO_RSP               0           0.0               0.0       0.0                0.0               
INTERNAL_RSP           0           0.0               0.0       0.0                0.0               
FAILED_SESSION_MSG     0           0.0               0.0       0.0                0.0               
_TRACE                 0           0.0               0.0       0.0                0.0               
SCHEMA_VERSION_RSP     0           0.0               0.0       0.0                0.0               
FINALIZE_COMMIT_MSG    0           0.0               0.0       0.0                0.0               
SNAPSHOT_MSG           0           0.0               0.0       0.0                0.0               
PREPARE_CONSISTENT_RSP 0           0.0               0.0       0.0                0.0               
PAXOS_PROPOSE_REQ      0           0.0               0.0       0.0                0.0               
PAXOS_PREPARE_RSP      0           0.0               0.0       0.0                0.0               
MUTATION_REQ           0           2346.799          10090.808 17436.917          74975.55          
READ_REQ               0           0.0               0.0       0.0                0.0               
PING_RSP               0           0.0               0.0       0.0                0.0               
RANGE_RSP              0           0.0               0.0       0.0                0.0               
VALIDATION_REQ         0           0.0               0.0       0.0                0.0               
SYNC_REQ               0           0.0               0.0       0.0                0.0               
_TEST_1                0           0.0               0.0       0.0                0.0               
GOSSIP_SHUTDOWN        0           0.0               0.0       0.0                0.0               
TRUNCATE_REQ           0           0.0               0.0       0.0                0.0               
_TEST_2                0           0.0               0.0       0.0                0.0               
GOSSIP_DIGEST_ACK      0           785.939           2346.799  14530.764000000001 14530.764000000001
SCHEMA_PUSH_REQ        0           0.0               0.0       0.0                0.0               
FINALIZE_PROMISE_MSG   0           0.0               0.0       0.0                0.0               
BATCH_STORE_REQ        0           0.0               0.0       0.0                0.0               
COUNTER_MUTATION_RSP   0           0.0               0.0       0.0                0.0               
REPAIR_RSP             0           0.0               0.0       0.0                0.0               
STATUS_REQ             0           0.0               0.0       0.0                0.0               
SCHEMA_PULL_REQ        0           0.0               0.0       0.0                0.0               
READ_REPAIR_REQ        0           0.0               0.0       0.0                0.0               
REPLICATION_DONE_REQ   0           0.0               0.0       0.0                0.0               
PAXOS_PROPOSE_RSP      0           0.0               0.0       0.0                0.0

The keyspace and table were created with:

create keyspace example with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };
create table example.tweet(timeline text, id UUID, text text, PRIMARY KEY(id));

And the client code:

package main

import (
	"fmt"
	"strconv"
	"sync"
	"time"

	"github.com/gocql/gocql"
)

const (
	gophers = 50   // concurrent writer goroutines
	entries = 3000 // inserts per goroutine (50 * 3000 = 150,000 rows)
)

func main() {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < gophers; i++ {
		wg.Add(1)
		// spin up a gopher
		go gopher(i, &wg)
	}

	wg.Wait()

	// Report the total elapsed time in milliseconds.
	fmt.Println("total spent time: ", time.Since(start).Milliseconds())
}

// gopher opens its own session and inserts `entries` rows one at a time.
func gopher(threadID int, wg *sync.WaitGroup) {
	defer wg.Done()
	cluster := gocql.NewCluster("10.30.1.1", "10.30.1.2", "10.30.1.3")
	cluster.ConnectTimeout = 30 * time.Second
	cluster.DisableInitialHostLookup = true
	cluster.Timeout = 25 * time.Second
	cluster.Consistency = gocql.LocalQuorum
	cluster.Keyspace = "example"
	session, err := cluster.CreateSession()
	if err != nil {
		panic(err)
	}
	defer session.Close()

	stmt := session.Query("INSERT INTO tweet (timeline, id, text) VALUES (?, ?, ?)")

	fmt.Println("StartTime: ", time.Now())
	for i := 0; i < entries; i++ {
		// Report failed inserts so they are not silently counted as successful writes.
		if err := stmt.Bind("me", gocql.TimeUUID(), "Hello"+strconv.Itoa(i)).Exec(); err != nil {
			fmt.Println("insert failed:", err)
		}
	}
	fmt.Println("EndTime:", time.Now())
}

Can anyone suggest what else I should look at?

YHC
  • Writing at local quorum - as the node count increased from 1 to 2, both writes would need to be acknowledged. The increase from 2 to 3 wouldn't have affected the acknowledgement, since it would still be 2 writes out of 3 to acknowledge - but the third would still take place and use resources. – Andrew Aug 04 '21 at 13:26
  • So what is the suggested consistency level? – YHC Aug 04 '21 at 17:43
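
As background for the comments above: the number of acknowledgements a quorum-level write waits for is derived from the keyspace's replication factor, as floor(RF / 2) + 1, not from the number of nodes in the cluster. A minimal sketch of that calculation, where the function name `quorum` is illustrative and not part of gocql or Cassandra:

```go
package main

import "fmt"

// quorum returns the number of replica acknowledgements a quorum-level
// write waits for, given the keyspace replication factor: floor(rf/2) + 1.
// Hypothetical helper for illustration only.
func quorum(rf int) int {
	return rf/2 + 1
}

func main() {
	for _, rf := range []int{1, 2, 3} {
		fmt.Printf("replication_factor=%d -> quorum=%d\n", rf, quorum(rf))
	}
}
```

With replication_factor 2, as in the keyspace above, the quorum is 2, so each LOCAL_QUORUM write waits for both replicas of the row. With replication_factor 3 it would still be 2, leaving one replica that does not have to respond before the write is acknowledged.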

1 Answer

If you are running all 3 VMs on the same physical host, that would invalidate your test, because the 3 VMs are competing for the same physical resources.

For the test to be valid, you should host each VM on a separate physical host. Cheers!

Erick Ramirez