I need to insert a huge amount of data using the DataStax Python driver for Cassandra. Because of the volume, plain execute() is too slow for me; execute_async() is much faster.
But I ran into data loss when calling execute_async(). If I use execute(), everything is fine. If I use execute_async() (for the SAME insert queries), only about 5-7% of my requests complete correctly, and no errors are raised. And if I add time.sleep(0.01) after every 1000 insert requests (still using execute_async()), everything is fine again.
No data loss (case 1):
for query in queries:
    session.execute(query)
No data loss (case 2):
counter = 0
for query in queries:
    session.execute_async(query)
    counter += 1
    if counter % 1000 == 0:
        time.sleep(0.01)
Data loss:
for query in queries:
    session.execute_async(query)
Is there any reason why this could happen?
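In case it helps narrow this down, here is a minimal sketch of how the returned futures could be checked for hidden errors, assuming the driver's documented ResponseFuture API (result() blocks until the request finishes and re-raises any error it hit). The contact point and keyspace name are placeholders; queries is the same list as above.

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])          # placeholder contact point
session = cluster.connect('my_keyspace')  # placeholder keyspace

# Fire off all inserts, but keep the ResponseFuture objects
# instead of discarding them.
futures = [session.execute_async(query) for query in queries]

# result() waits for each request to complete and re-raises any
# error (e.g. a timeout) that the fire-and-forget loop would hide.
for future in futures:
    future.result()

If result() starts raising timeouts here, that could explain the missing rows, since the bare loop queues requests without ever waiting on them.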
The cluster has 2 nodes.
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
DataStax Python driver version 3.14.0
Python 3.6