
We need to benchmark the time taken for a cluster to balance while varying the load, say 100 GB, 200 GB, and 300 GB. Does anyone know how to inject data of these sizes using the cassandra-stress tool?

P.S. I have been using this command to load a specific number of rows: `cassandra-stress -d 192.168.127.48,192.168.127.44,192.168.127.47 -l 3 -n 10000 -o INSERT`

However, this doesn't determine the size of the data being loaded. Thanks, Smitha

  • Try a REST-interface for Cassandra like [hmsonline/cassandra-rest](https://github.com/hmsonline/cassandra-rest). I used something similar (no source, sorry) from python a couple of years back. – Henk Langeveld May 19 '15 at 08:53

1 Answer


You can define your own schema for the stress test, which gives you more granular control over the size of the data generated per row. See the section on Column Distributions here. You can define a different size distribution for each column; you would probably want a fixed distribution.
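Once every row has a fixed size, the row count needed for a target volume is simple arithmetic. Here is a rough sketch in Python, assuming a hypothetical fixed payload of 1024 bytes per row (that figure and the helper name `rows_for_target` are illustrative, not from the stress tool). It counts raw payload only; the on-disk size will differ because of replication, compression, and storage overhead, so treat the result as a starting point and check the actual load on the nodes.

```python
GB = 1000 ** 3  # decimal gigabytes; use 1024 ** 3 if you mean GiB


def rows_for_target(target_bytes, bytes_per_row):
    """Return the smallest row count whose raw payload reaches target_bytes."""
    if bytes_per_row <= 0:
        raise ValueError("bytes_per_row must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-target_bytes // bytes_per_row)


if __name__ == "__main__":
    BYTES_PER_ROW = 1024  # assumed fixed row payload, adjust to your schema
    for size_gb in (100, 200, 300):
        rows = rows_for_target(size_gb * GB, BYTES_PER_ROW)
        print(f"{size_gb} GB -> {rows} rows")
```

The resulting number is what you would pass as the row count (the `-n` value in the command from the question).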

jny