
I am planning to use the DataStax Java driver to write to Cassandra. I am mainly interested in its batch write and asynchronous features, but I have not been able to find any tutorials that explain how to incorporate these features into the code below, which uses the DataStax Java driver.

/**
 * Performs an upsert of the specified attributes for the specified id.
 */
public void upsertAttributes(final String userId, final Map<String, String> attributes, final String columnFamily) {

    try {

        // make a sql here using the above input parameters.

        String sql = sqlPart1.toString()+sqlPart2.toString();

        DatastaxConnection.getInstance();
        PreparedStatement prepStatement = DatastaxConnection.getSession().prepare(sql);
        prepStatement.setConsistencyLevel(ConsistencyLevel.ONE);        

        BoundStatement query = prepStatement.bind(userId, attributes.values().toArray(new Object[attributes.size()]));

        DatastaxConnection.getSession().execute(query);

    } catch (InvalidQueryException e) {
        LOG.error("Invalid Query Exception in DatastaxClient::upsertAttributes "+e);
    } catch (Exception e) {
        LOG.error("Exception in DatastaxClient::upsertAttributes "+e);
    }
}

In the below code, I am creating a Connection to Cassandra nodes using Datastax Java driver.

/**
 * Creating Cassandra connection using Datastax Java driver
 *
 */
private DatastaxConnection() {

    try{
        builder = Cluster.builder();
        builder.addContactPoint("some_nodes");

        builder.poolingOptions().setCoreConnectionsPerHost(
                HostDistance.LOCAL,
                builder.poolingOptions().getMaxConnectionsPerHost(HostDistance.LOCAL));

        cluster = builder
                .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
                .withReconnectionPolicy(new ConstantReconnectionPolicy(100L))
                .build();

        StringBuilder s = new StringBuilder();
        Set<Host> allHosts = cluster.getMetadata().getAllHosts();
        for (Host h : allHosts) {
            s.append("[");
            s.append(h.getDatacenter());
            s.append(h.getRack());
            s.append(h.getAddress());
            s.append("]");
        }
        System.out.println("Cassandra Cluster: " + s.toString());

        session = cluster.connect("testdatastaxks");

    } catch (NoHostAvailableException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    } catch (Exception e) {

    }
}

Can anybody help me add batch writes or asynchronous execution to the code above? Thanks for the help.

I am running Cassandra 1.2.9.

arsenal

2 Answers


For async execution, it's as simple as calling executeAsync instead of execute:

...
DatastaxConnection.getSession().executeAsync(query);
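
One thing worth adding: executeAsync returns a future, and if you never look at it, a failed write goes unnoticed. As far as I know, driver versions up to 3.x return a ResultSetFuture (a Guava ListenableFuture) and 4.x returns a CompletionStage. The sketch below only illustrates the pattern of attaching a completion handler and waiting for all writes. It does not use the driver at all: fakeExecuteAsync is a hypothetical stand-in for session.executeAsync, and CompletableFuture stands in for the driver's future type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncWriteSketch {

    // Hypothetical stand-in for session.executeAsync(query).
    // It fails for ids ending in "-bad" so the error path can be observed.
    static CompletableFuture<String> fakeExecuteAsync(String userId) {
        return CompletableFuture.supplyAsync(() -> {
            if (userId.endsWith("-bad")) {
                throw new IllegalStateException("write failed for " + userId);
            }
            return "ok:" + userId;
        });
    }

    // Starts every write without blocking, then waits for all of them
    // and returns how many failed.
    static int writeAll(List<String> userIds) {
        AtomicInteger failures = new AtomicInteger();
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String userId : userIds) {
            pending.add(fakeExecuteAsync(userId).whenComplete((result, error) -> {
                if (error != null) {
                    failures.incrementAndGet(); // log or retry here in real code
                }
            }));
        }
        for (CompletableFuture<String> future : pending) {
            try {
                future.join();
            } catch (CompletionException e) {
                // already counted in the completion handler above
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        int failed = writeAll(Arrays.asList("user-1", "user-2-bad", "user-3"));
        System.out.println(failed + " write(s) failed");
    }
}
```

With the real driver you would attach the handler to whatever executeAsync returns in your version; the point is only that the result of the call should not be thrown away.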

For the batch, you need to build the query (I use strings because the compiler knows how to optimize string concatenation really well):

String cql = "BEGIN BATCH "
           + "INSERT INTO test.prepared (id, col_1) VALUES (?,?); "
           + "INSERT INTO test.prepared (id, col_1) VALUES (?,?); "
           + "APPLY BATCH;";

DatastaxConnection.getInstance();
PreparedStatement prepStatement = DatastaxConnection.getSession().prepare(cql);
prepStatement.setConsistencyLevel(ConsistencyLevel.ONE);        

// this is where you need to be careful
// bind expects a comma separated list of values for all the params (?) above
// so for the above batch we need to supply 4 params:                     
BoundStatement query = prepStatement.bind(userId, "col1_val", userId_2, "col1_val_2");

DatastaxConnection.getSession().execute(query);
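
If the number of rows in the batch varies, the CQL string and the bind values have to be generated together so that they stay in step. Here is a minimal sketch of that idea in plain Java. BatchCqlSketch and both of its methods are hypothetical helpers, not part of the driver, and the table and column names are simply the ones from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the DataStax driver.
class BatchCqlSketch {

    // Builds a batch with one INSERT (two '?' markers) per row.
    static String buildBatchCql(int rowCount) {
        StringBuilder cql = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < rowCount; i++) {
            cql.append("INSERT INTO test.prepared (id, col_1) VALUES (?,?); ");
        }
        return cql.append("APPLY BATCH;").toString();
    }

    // Flattens rows of {id, col_1} into one array, in the same order
    // as the '?' markers appear in the generated CQL.
    static Object[] flattenBindValues(List<Object[]> rows) {
        List<Object> values = new ArrayList<Object>();
        for (Object[] row : rows) {
            values.addAll(Arrays.asList(row));
        }
        return values.toArray();
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<Object[]>();
        rows.add(new Object[] {"user-1", "col1_val"});
        rows.add(new Object[] {"user-2", "col1_val_2"});

        String cql = buildBatchCql(rows.size());
        Object[] values = flattenBindValues(rows);

        System.out.println(cql);
        System.out.println(values.length + " bind values");
        // With the driver you would then call something like:
        // session.prepare(cql).bind(values)
    }
}
```

Keep in mind that every distinct row count produces a different CQL string and therefore a different prepared statement, so this probably only makes sense if the batch sizes you use are few and fixed.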

On a side note, I think your binding of the statement might look something like this, assuming you change attributes to a list of maps where each map represents an update/insert inside the batch:

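// bind() takes varargs, so collect every value into one flat list first;
// an array passed next to userId would be bound as a single value
List<Object> values = new ArrayList<Object>();
values.add(userId);
values.addAll(attributesList.get(0).values());
values.add(userId_2);
values.addAll(attributesList.get(1).values());
BoundStatement query = prepStatement.bind(values.toArray());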
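
A note on passing arrays to bind: it is declared with a varargs parameter (Object... values), and Java does not flatten an array that is passed alongside other arguments. The array becomes one element of the varargs array, so the number of bound values no longer matches the number of ? markers. The plain-Java sketch below shows the difference. countValues is a hypothetical stand-in for any varargs method and does not touch the driver.

```java
class VarargsSketch {

    // Hypothetical stand-in for a varargs method such as bind(Object... values).
    static int countValues(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        Object[] attributeValues = {"a", "b", "c"};

        // The array is passed as ONE element next to the id: 2 values, not 4.
        int nested = countValues("user-1", attributeValues);

        // Flattened by hand: 4 values, one per '?' marker.
        Object[] flat = new Object[attributeValues.length + 1];
        flat[0] = "user-1";
        System.arraycopy(attributeValues, 0, flat, 1, attributeValues.length);
        int flattened = countValues(flat);

        System.out.println(nested + " vs " + flattened); // prints "2 vs 4"
    }
}
```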
Rui Vieira
Lyuben Todorov
  • Is there a way to do this with named parameters? – Highstead Mar 30 '14 at 14:55
  • @Highstead What programming language? The above is java so ([sort of no](http://java.dzone.com/articles/named-parameters-java)) – Lyuben Todorov Mar 30 '14 at 15:02
  • I was focused on python but I assumed if there was a way to do it in one there would be a way to do it in the other. The old cql driver supports it but has been deprecated. So i was looking to replace the functionality. – Highstead Mar 31 '14 at 14:38
  • @Highstead Python = yes to named parameters, [example here](https://github.com/twissandra/twissandra/blob/master/cass.py#L103) using the newer python DataStax driver. – Lyuben Todorov Mar 31 '14 at 17:22
  • Is that done server side or client side? I'm inclined to guess client side with the %(p_name)s syntax. – Highstead Mar 31 '14 at 18:20
  • It is a client driver so it is client side. You do this in your own code, which then pushes the new data into your Cassandra server. – Lyuben Todorov Mar 31 '14 at 19:34
  • Concern is mostly about injection or is this something I'm responsible for with cassandra? – Highstead Mar 31 '14 at 22:24
  • Be specific, do you mean code injection, SQL injection? For the latter you want to use [prepared statements](http://www.datastax.com/documentation/developer/python-driver/1.0/python-driver/quick_start/qsSimpleClientBoundStatements_t.html). You can't inject code via cql3. – Lyuben Todorov Apr 01 '14 at 09:44
  • You're using string queries, but is there a way to use the QueryBuilder and prepare/bind a few statements and then batch-execute them? To my understanding so far this is not possible... – VHristov Apr 29 '14 at 13:53
  • @LyubenTodorov I also have similar question [here](https://stackoverflow.com/questions/26265224/how-to-efficiently-use-batch-writes-to-cassandra-using-datastax-java-driver). If possible, can you help me out? – john Oct 08 '14 at 21:24

With the string-based approach in Lyuben's answer, you can't set batch attributes such as Type.COUNTER (which you need if you update counters). Instead, you can add bound prepared statements to a BatchStatement like so:

final String insertQuery = "INSERT INTO test.prepared (id, col_1) VALUES (?,?);";
final PreparedStatement prepared = session.prepare(insertQuery);

final BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
batch.add(prepared.bind(userId1, "something"));
batch.add(prepared.bind(userId2, "another"));
batch.add(prepared.bind(userId3, "thing"));

session.executeAsync(batch);
cfeduke
  • I like this better than the accepted answer. Here contents of the batch can be dynamic (vs. fixed CQL & number of arguments in the accepted answer) – 0cd Jul 25 '16 at 23:31
  • I believe this is bad code (as of 2019). BatchStatement is immutable. You need to batch = batch.add(... – Tony Schwartz Oct 10 '19 at 21:51