I'm getting this warning in the log:

WARN [Native-Transport-Requests:17058] 2014-07-29 13:58:33,776 BatchStatement.java (line 223) Batch of prepared statements for [keyspace.tablex] is of size 10924, exceeding specified threshold of 5120 by 5804.

Is there a way in Spring Data Cassandra to specify the batch size?

I'm using Cassandra 2.0.9 and Spring Data Cassandra 1.0.0-RELEASE.

Martin Schröder
Oggie

2 Answers

This is just a warning, informing you that the batch exceeds a configured size threshold.

The batch is still processed. The reasoning is that large batches are expensive and may cause cluster imbalance, so the warning alerts you (the developer) to them.

To adjust when this warning is produced, look for batch_size_warn_threshold_in_kb in cassandra.yaml. The threshold of 5120 bytes in your log corresponds to a setting of 5 KB.

Here is the ticket where it was introduced: https://issues.apache.org/jira/browse/CASSANDRA-6487

Viliam

I have done extensive performance testing and tuning on Cassandra, working closely with DataStax Support.

That is why I created the ingest() methods in SDC*, which are super fast in 1.0.4.RELEASE and higher.

This method caches the PreparedStatement for you, then loops over the individual bind values and calls executeAsync for each insert. This sounds counterintuitive, but it is the fastest (and most balanced) way to insert into Cassandra.
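
To illustrate the pattern described above (one asynchronous insert per row rather than one large batch), here is a minimal sketch. It is not the actual ingest() implementation: AsyncIngestSketch and the executeAsync parameter are hypothetical stand-ins for "bind the row to a cached PreparedStatement and execute it asynchronously", written with plain java.util.concurrent types so it runs without a Cassandra driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Sketch only: hypothetical stand-in, not the Spring Data Cassandra
// or DataStax driver API.
class AsyncIngestSketch {

    // Issues one asynchronous insert per row instead of a single large batch.
    // executeAsync is an assumed callback that binds one row to an
    // already-prepared statement and returns a future for that insert.
    static <R> List<CompletableFuture<Void>> ingest(
            List<R> rows, Function<R, CompletableFuture<Void>> executeAsync) {
        List<CompletableFuture<Void>> futures = new ArrayList<>(rows.size());
        for (R row : rows) {
            futures.add(executeAsync.apply(row));
        }
        return futures;
    }

    // Blocks until every insert has completed; throws if any insert failed.
    static void awaitAll(List<CompletableFuture<Void>> futures) {
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }
}
```

Because no batch is built, there is no batch size for the server to measure, which is presumably why this approach does not trigger the warning. In real code you would likely also want to cap the number of in-flight requests.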

David Webb
  • But does this mean we're ok with the warning, or can we make adjustments to the code to stay below the threshold? I don't think it's wise to just start increasing the threshold in the cassandra.yaml just to get rid of the message. – Oggie Sep 30 '14 at 14:26
  • What's the best way to change the batch size? Just limit the number of items in the array to the call to template.insertAsynchronously()? – Oggie Sep 30 '14 at 14:55
  • What's the appropriate equivalent method to use on a query? I'm seeing a substantial number of batch size warnings and slow performance using CassandraOperations.stream() w/ Cassandra 3.0.7 and SDC* 1.5.0.M1 – Jeffrey Zampieron Oct 01 '16 at 15:06
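
On the question raised in the comments about limiting the number of items per call: one possible approach is to split the rows into smaller chunks before handing each chunk to the batching call. The sketch below is an assumption, not an SDC* feature; BatchChunker and maxPerBatch are hypothetical names. Note that the threshold in the log is measured in bytes (10924 against 5120), not in rows, so a suitable chunk size depends on how large your rows are and has to be found by experiment.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a hypothetical helper, not part of Spring Data Cassandra.
class BatchChunker {

    // Splits items into consecutive chunks of at most maxPerBatch elements.
    // The last chunk may be smaller than the others.
    static <T> List<List<T>> chunk(List<T> items, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Each resulting chunk can then be passed to the insert call separately, for example one template.insertAsynchronously() call per chunk.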