The best way to get table stats like total row count and current workload like op/s in Cassandra?

Question

I am trying to do two things with our Cassandra-based application, whose services cannot all be stopped for testing:

1. Test performance, such as op/s and 99.9th percentile latency, using the Python driver. To get more accurate results, we want to know the current workload of Cassandra, such as read and write op/s.
2. Get information such as the total number of rows a table contains (our table has almost 8 billion records for now) and how many records are inserted every week (some data sources are outside our control, so it is hard to get this information from the insert scripts directly).
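For the op/s goal, one possible approach (a sketch, not a confirmed method) is to sample a cumulative counter twice, for example a read or write count reported by nodetool tablestats, and divide the difference by the elapsed seconds. The function name and the way a counter reset is handled below are assumptions made for illustration.

```python
def ops_per_second(count_start, count_end, interval_seconds):
    """Average operations per second between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = count_end - count_start
    if delta < 0:
        # Assumption: a counter that went backwards was reset to zero
        # (for example by a node restart), so only count_end is usable.
        delta = count_end
    return delta / interval_seconds


# Example: the counter grew from 1000 to 4000 within 60 seconds.
rate = ops_per_second(1000, 4000, 60)
```

Counters read from a single node describe only that node, so a cluster-wide figure would need the samples from every node.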

I have tried some methods for these two problems:

1. Updated in comments.
2. `select count(*) from xxx` is not usable here because it is far too slow. I then tried to get some information using nodetool tablestats. Taking the system_distributed keyspace as an example:

Keyspace : system_distributed
    Read Count: 0
    Read Latency: NaN ms
    Write Count: 0
    Write Latency: NaN ms
    Pending Flushes: 0
        Table: parent_repair_history
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Space used by snapshots (total): 0
        Off heap memory used (total): 0
        SSTable Compression Ratio: -1.0
        Number of partitions (estimate): 0
        Memtable cell count: 0
        Memtable data size: 0
        Memtable off heap memory used: 0
        Memtable switch count: 0
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 0
        Local write latency: NaN ms
        Pending flushes: 0
        Percent repaired: 100.0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 0
        Bloom filter off heap memory used: 0
        Index summary off heap memory used: 0
        Compression metadata off heap memory used: 0
        Compacted partition minimum bytes: 0
        Compacted partition maximum bytes: 0
        Compacted partition mean bytes: 0
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0
        Table: repair_history
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Space used by snapshots (total): 0
        Off heap memory used (total): 0
        SSTable Compression Ratio: -1.0
        Number of partitions (estimate): 0
        Memtable cell count: 0
        Memtable data size: 0
        Memtable off heap memory used: 0
        Memtable switch count: 0
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 0
        Local write latency: NaN ms
        Pending flushes: 0
        Percent repaired: 100.0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 0
        Bloom filter off heap memory used: 0
        Index summary off heap memory used: 0
        Compression metadata off heap memory used: 0
        Compacted partition minimum bytes: 0
        Compacted partition maximum bytes: 0
        Compacted partition mean bytes: 0
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0
        Table: view_build_status
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Space used by snapshots (total): 0
        Off heap memory used (total): 0
        SSTable Compression Ratio: -1.0
        Number of partitions (estimate): 0
        Memtable cell count: 0
        Memtable data size: 0
        Memtable off heap memory used: 0
        Memtable switch count: 0
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 0
        Local write latency: NaN ms
        Pending flushes: 0
        Percent repaired: 100.0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 0
        Bloom filter off heap memory used: 0
        Index summary off heap memory used: 0
        Compression metadata off heap memory used: 0
        Compacted partition minimum bytes: 0
        Compacted partition maximum bytes: 0
        Compacted partition mean bytes: 0
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0
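Output in this shape can be turned into a dictionary so that fields such as "Number of partitions (estimate)" and "Local write count" can be collected per table. The following is a minimal sketch that assumes the layout shown above (a "Keyspace" line, then "Table" lines, each followed by "name: value" lines); the function names and the shortened sample are illustrative only.

```python
def parse_tablestats(text):
    """Parse `nodetool tablestats` text into {keyspace: {table: {field: value}}}."""
    stats = {}
    keyspace = None
    table = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Keyspace":
            keyspace, table = value, None
            stats[keyspace] = {}
        elif key == "Table" and keyspace is not None:
            table = value
            stats[keyspace][table] = {}
        elif keyspace is not None and table is not None:
            # Lines that appear before the first "Table" line are skipped.
            stats[keyspace][table][key] = value
    return stats


def partition_estimates(stats):
    """Return {(keyspace, table): estimated partition count} for one node."""
    field = "Number of partitions (estimate)"
    return {
        (ks, tbl): int(fields[field])
        for ks, tables in stats.items()
        for tbl, fields in tables.items()
        if field in fields
    }


# Shortened sample taken from the output above.
SAMPLE = """\
Keyspace : system_distributed
    Read Count: 0
    Write Count: 0
        Table: parent_repair_history
        Number of partitions (estimate): 0
        Local write count: 0
        Table: repair_history
        Number of partitions (estimate): 0
        Local write count: 0
"""

parsed = parse_tablestats(SAMPLE)
estimates = partition_estimates(parsed)
```

Note that this field estimates partitions, not rows, and only for the node that was queried, so adding up the values from all nodes would count each partition roughly once per replica.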

This output has some fields that I do not understand:

a. What does Local write count mean? If a table is distributed across different nodes and has multiple replicas, how do I calculate how many rows that table contains?

b. Do the first five lines (Read Count, Write Count, and so on) describe the keyspace as a whole (system_distributed)?

c. Are all the latencies shown here average latencies?

I would appreciate any suggestions.

Jiashi

Update: I have fixed the stats-reading bug. I still use `scales.getStats()['cassandra-0']`, and it works well now. But I still have a question: does scales measure the latency of only one process's requests, or of all requests? — Whichname, Jul 10 '19 at 06:55
I found that some logs printed later show a lower max latency than the first log (printed earlier by one process), so I guess it measures only one process's requests. — Whichname, Jul 10 '19 at 07:01
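If the stats are indeed kept per process (an assumption consistent with the observation above, not something confirmed here), a figure such as the 99.9th percentile or the max only becomes cluster-wide after the latency samples of all test processes are merged. A minimal sketch using the nearest-rank percentile definition; the sample values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latency samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct percent of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds recorded by two separate test processes.
process_a = [5, 7, 9]
process_b = [6, 50]

max_a = percentile(process_a, 100)                # what process A alone reports
max_all = percentile(process_a + process_b, 100)  # after merging both processes
```

With these made-up numbers, process A alone never sees the slow request recorded by process B, which is the kind of difference between logs described in the comment above.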

0 Answers