I have 12 Cassandra database nodes running a mix of Ubuntu 12 and 14. All of them are bare-metal machines with SSDs and 1Gb network cards, colocated in the same DC (managed colo).
Under light load, the latency between all of these nodes and our cloud nodes (also in the same DC) is under 1ms.
When I start ramping up writes to the database nodes, the latency to and from them climbs to around 300ms. At that point CPU load is around 1 (on 4 physical cores), disk utilization is below 3%, and dstat shows a network load of around 18MiB.
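As a rough sanity check on whether the link itself could be saturated, here is a minimal sketch of the arithmetic. It assumes the dstat figure is a per-second rate in MiB and ignores protocol overhead; `link_utilization` is just a helper name made up for illustration.

```python
# Assumption: the dstat figure is a per-second rate in MiB (2**20 bytes).
MIB = 2**20


def link_utilization(throughput_mib_per_s, link_gbit_per_s=1.0):
    """Return the fraction of nominal link capacity in use."""
    bits_per_s = throughput_mib_per_s * MIB * 8
    capacity_bits_per_s = link_gbit_per_s * 1e9
    return bits_per_s / capacity_bits_per_s


utilization = link_utilization(18)
print(f"{utilization:.1%} of a 1Gb link")  # prints "15.1% of a 1Gb link"
```

Under that assumption the nodes are using roughly 15% of the nominal capacity of a 1Gb card, so raw bandwidth alone does not obviously explain the 300ms latency.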
Local reads & writes to Cassandra are relatively quick, so I've ruled out the application layer being overloaded.
What tools and settings should I be looking at to understand why my latency is so bad? I have monitoring tools in place to view these issues, but I'm unsure where to start in diagnosing them.