-3

We are using Datastax Cassandra java driver (version 3.x). There is a logged batch Select statement with 'IN' clause, having a high number of values. Due to which we are facing a serious issue of low performance. Following is the format of query visible while debugging the Java application:

SELECT COL1, COL2, ... FROM XXXX WHERE PARTITIONKEY IN () AND CLUSTERINGKEY IN();

Can anyone please share how such SELECT with more than one IN clause shall be handled when there is high number of values available to pass inside it.

Does Session#executeAsync can solve this issue.

Thanks.

Horia
  • 2,942
  • 7
  • 14
Himanshu Singh
  • 199
  • 3
  • 15
  • Query have high no of values inside IN clause : SELECT COL1, COL2, ... FROM XXXX WHERE PARTITIONKEY IN (val1, val2... val N) AND CLUSTERINGKEY IN(val1, val2... val N); Both IN clauses are having same no of values – Himanshu Singh Mar 16 '18 at 00:27
  • Update your question instead of commenting. – Sam Orozco Mar 16 '18 at 00:30

1 Answers1

0

Don't use IN query for partition key (For a limited number of fixed data you can use if performance is not an issue). It imposes a lot of work to Coordinator node. You can use IN for clustering key but make sure your list is not too large as well.

executeAsync is the best approach here. I am adding a code snippet here.

PreparedStatement getInfo = session.prepare("SELECT COL1, COL2, ... FROM XXXX WHERE PARTITIONKEY = ?");

        List<ResultSetFuture> futures = new ArrayList<>();
        for (Object key : list) {
            ResultSetFuture future = session.executeAsync(getInfo(key));
            futures.add(future);
        }
        for (ResultSetFuture future : futures) {
            try {
                ResultSet rs = future.getUninterruptibly();
                Row rw = rs.one();
                if (rw != null) {
                    // set DB info into list or DTO 
                }
            } catch (Exception e) {
                // print log
                LOGGER.error("", e);
            }
        }

It's a sample code. Please read this link for more details:

Cassandra Query Patterns: Not using the β€œin” query for multiple partitions.

Chaity
  • 1,348
  • 13
  • 20