I am new to Apache Cassandra and I am planning to use it as the data store for a new project because of its write performance. I have set up a Cassandra cluster with three nodes and a replication factor of 3. My program A uses DataStax's cassandra-driver-core 2.1.7 to write data to and read data from Cassandra. Each execution of the program writes about 50 records into Cassandra using a batch statement. Testing a single execution shows no problem at all. However, when I start to run A more intensively, a problem occurs.
Details are as follows: another program B calls A 40 times within 10 seconds, so there should be 2k records (40 × 50) in Cassandra after B finishes executing. However, the number of records actually written was only 25-30% of the 2k (it varies randomly on each run of B). I was using cqlsh to check the number of records written, by the way. I need to re-run B several times before all 2k records eventually end up in Cassandra.
I have no clue at this point: no error was reported during the execution of either A or B, and according to the log, A did get executed 40 times.
I don't know whether this is related to the cluster setup, the consistency level setting, etc., or whether there is any tuning I need to do to handle higher-frequency writes.
The code is something like:
    // Prepare the insert once, then bind it 50 times into a single batch.
    String query = "insert into A (a,b,c,d,e,f) values (?,?,?,?,?,?)";
    PreparedStatement p = session.prepare(query);
    BatchStatement b = new BatchStatement();
    for (int i = 0; i < 50; i++) {
        BoundStatement b1 = p.bind();
        b1.setInt("a", A);
        ...
        b1.setInt("f", F);
        b.add(b1);
    }
    // Send all 50 bound statements to the cluster in one request.
    session.execute(b);
Any help would be greatly appreciated!
Addition:
I changed my code to not use a batch statement, as @aaron and others suggested. The problem still remains: not all records were written into Cassandra (I mean I cannot see them using a select statement in cqlsh). After a while, I noticed that the problem only occurred for records that had previously been inserted (and were removed with a delete statement in cqlsh before being inserted again). If the records have never been inserted before, the correct results are shown by cqlsh's "select * from ". Can anyone enlighten me as to why this is so and whether there is a way to avoid it? Thanks a lot.
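One possible explanation, which nothing in this post confirms, is timestamp-based reconciliation. Cassandra keeps the version of a row with the highest write timestamp, and a delete is stored as a marker (a tombstone) with its own timestamp. If the delete was stamped by a clock that runs ahead of the clock stamping the later insert, the tombstone would still win and the re-inserted row would stay invisible, while rows that were never deleted would show up normally. The sketch below is a toy in-memory model of that rule only. `LastWriteWinsDemo`, `Table`, and `Version` are hypothetical names made up for illustration, not part of Cassandra or the DataStax driver, and the model ignores equal timestamps.

```java
import java.util.HashMap;
import java.util.Map;

public class LastWriteWinsDemo {

    /** One stored version of a row: a value, or a tombstone when value is null. */
    static final class Version {
        final String value;
        final long timestamp; // write timestamp assigned by whoever issued the write

        Version(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    /** Toy table that keeps, per key, only the version with the highest timestamp. */
    static final class Table {
        private final Map<Integer, Version> rows = new HashMap<>();

        void insert(int key, String value, long timestamp) {
            apply(key, new Version(value, timestamp));
        }

        void delete(int key, long timestamp) {
            apply(key, new Version(null, timestamp));
        }

        private void apply(int key, Version incoming) {
            Version current = rows.get(key);
            // Arrival order does not matter; only the timestamps are compared.
            if (current == null || incoming.timestamp > current.timestamp) {
                rows.put(key, incoming);
            }
        }

        /** Returns the visible value, or null if the row is absent or deleted. */
        String select(int key) {
            Version v = rows.get(key);
            return (v == null || v.isTombstone()) ? null : v.value;
        }
    }

    public static void main(String[] args) {
        Table table = new Table();

        table.insert(1, "first", 100L);
        table.delete(1, 300L);           // delete stamped by a clock that runs ahead
        table.insert(1, "second", 200L); // re-insert stamped by a slower clock
        System.out.println("key 1: " + table.select(1)); // re-insert is shadowed

        table.insert(2, "fresh", 200L);  // never deleted, so nothing shadows it
        System.out.println("key 2: " + table.select(2));
    }
}
```

If this is what is happening, comparing the clocks of the three nodes and the client machines, or inspecting `WRITETIME(...)` on a non-key column of rows that did get written, would be one way to check it.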