
I have 12 Cassandra database nodes running a mix of Ubuntu 12 and 14. All are bare-metal machines with SSDs and 1Gb network cards, colocated in the same DC (managed colo).

Under light operation, the latency between all the nodes and our cloud nodes (also in the same DC) is under 1ms.

When I start ramping up writes to the database nodes, the latency to and from these nodes climbs sharply to around 300ms. CPU load is around 1 (4 physical cores), disk utilization is below 3%, and dstat shows a network load of around 18MiB.

Local reads & writes to Cassandra are relatively quick, so I've ruled out the application layer being overloaded.

What tools and settings should I look into to understand why my latency is so bad? I have monitoring tools in place that show these issues, but I'm unsure where to start in diagnosing them.

killachaos
  • A quick Google search turned up this entry http://www.datastax.com/dev/blog/tools-for-testing-cassandra as well as the local Cassandra monitoring tools at http://wiki.apache.org/cassandra/Operations#Monitoring. Sounds like a logical place to start. – Thomas N Mar 17 '16 at 21:36
  • I'm pretty sure the issues aren't related to Cassandra. I don't have the operational resources to conduct a full test, but under network load, *every* network related function seems to slow down, not just Cassandra. – killachaos Mar 19 '16 at 00:13
  • If you're certain that the issues are network-related, there are a lot of tuning parameters that can be brought to bear. http://www.nateware.com/linux-network-tuning-for-2013.html#.VvAQARhD4bo has general sysctl tuning parameters and advice on how/when to use them. But again, I would be absolutely certain that it's a network issue before starting down this path. – Thomas N Mar 21 '16 at 15:23

1 Answer


My starting point for issues like this is usually perf top. This will quickly give you an idea where the largest amounts of time are spent. See https://perf.wiki.kernel.org/index.php/Tutorial for some nice examples on how to use it.
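To line up what perf top shows with the latency climb itself, it can help to sample node-to-node latency while the write load ramps up. The following is only a rough sketch, not something from the perf tutorial: it times TCP handshakes to a peer and reports min, median and max. The function names, sample count and interval are all made up for illustration, and the peer address in the usage comment is a placeholder.

```python
import socket
import statistics
import time


def connect_latency_ms(host, port, timeout=2.0):
    """Time a single TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the connection is closed again as soon as it is established
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Reduce a list of latency samples to min / median / max."""
    ordered = sorted(samples_ms)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def sample(host, port, count=20, interval_s=0.5):
    """Take `count` handshake timings, pausing `interval_s` between them."""
    samples = []
    for _ in range(count):
        samples.append(connect_latency_ms(host, port))
        time.sleep(interval_s)
    return summarize(samples)


# Example call (hypothetical peer address; 9042 is Cassandra's default
# native transport port, adjust to whatever your nodes listen on):
#   print(sample("10.0.0.2", 9042))
```

Running this once under light load and once during the write ramp, with perf top open on the target node, gives a baseline to compare against. A handshake only exercises the kernel network path, so if it also slows to hundreds of milliseconds, that supports the suspicion that the problem sits below Cassandra.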

al.