0

I am trying to list the column families in Cassandra using the Astyanax driver. It lists the keyspaces OK, but many column families are missing from the output.

I have a simple program for this:

import java.util.List;

import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Cluster;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.ddl.ColumnFamilyDefinition;
import com.netflix.astyanax.ddl.KeyspaceDefinition;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;

public class App {

  public static void main(String[] args) throws Exception {

    // Connection pool pointing at the Thrift port (9160) of a local node
    ConnectionPoolConfigurationImpl cpool = new ConnectionPoolConfigurationImpl("ConnectionPool")
        .setPort(9160)
        .setSeeds("localhost");

    AstyanaxConfigurationImpl astyanaxConfiguration = new AstyanaxConfigurationImpl();
    AstyanaxContext.Builder ctxBuilder = new AstyanaxContext.Builder();
    ctxBuilder.forCluster("Cluster")
        .withAstyanaxConfiguration(astyanaxConfiguration)
        .withConnectionPoolConfiguration(cpool)
        .withConnectionPoolMonitor(new CountingConnectionPoolMonitor());

    // Build a cluster-level client over Thrift and start its connection pool
    AstyanaxContext<Cluster> clusterContext = ctxBuilder.buildCluster(ThriftFamilyFactory.getInstance());
    clusterContext.start();
    Cluster cluster = clusterContext.getClient();

    // Print every keyspace together with the column families Thrift reports for it
    for (KeyspaceDefinition ksDef : cluster.describeKeyspaces()) {
      List<ColumnFamilyDefinition> cfDefList = ksDef.getColumnFamilyList();
      System.out.println("there are " + cfDefList.size() + " column families in keyspace " + ksDef.getName());
      for (ColumnFamilyDefinition cfDef : cfDefList) {
        System.out.println(" - " + cfDef.getName());
      }
    }
  }
}

It lists the keyspaces, but many of the column families are missing. Here is the output. The default keyspaces are all there, but most of their column families are not:

there are 0 column families in keyspace system_distributed
there are 3 column families in keyspace system
- hints
- schema_keyspaces
- IndexInfo
there are 2 column families in keyspace system_auth
- role_members
- resource_role_permissons_index
there are 0 column families in keyspace system_traces

I can use cqlsh to confirm that the column families do exist:

cqlsh> DESCRIBE COLUMNFAMILIES

Keyspace system_traces
----------------------
events  sessions

Keyspace system_auth
--------------------
resource_role_permissons_index  role_permissions  role_members  roles

Keyspace system
---------------
available_ranges  size_estimates    schema_usertypes    compactions_in_progress
range_xfers       peers             paxos               schema_aggregates
schema_keyspaces  schema_triggers   batchlog            schema_columnfamilies
schema_columns    sstable_activity  schema_functions    local
"IndexInfo"       peer_events       compaction_history  hints

Keyspace system_distributed
---------------------------
repair_history  parent_repair_history

The output above is from Cassandra 2.2, but I have confirmed the same behaviour in other versions of Cassandra and in Scylla.

albertlockett

3 Answers

5

Thrift is deprecated and no longer enabled by default in Cassandra, so you would need to enable it before you can use it. Keep in mind that in upcoming versions of Cassandra it doesn't even exist. Even with it enabled, it's very likely that things won't work quite right: nothing really uses it anymore, and few tests still exercise it. How tables are stored and retrieved changed between versions, so the driver has to be aware of that. As Astyanax isn't maintained, it likely won't have it right.

Astyanax has been retired and is only around for older apps. You should really use the Java driver (the Astyanax page makes the same recommendation).
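
As a rough sketch of what the CQL route looks like for this question: the table names live in schema tables that any CQL client can query. The helper below only builds the query text. `SchemaQueries` and `tablesQuery` are hypothetical names, not part of any driver, and the table and column names assume Cassandra 2.x (`system.schema_columnfamilies`, which also appears in the cqlsh output in the question) versus 3.0 and later (`system_schema.tables`).

```java
// Sketch only: builds CQL text. Executing it requires a CQL driver session,
// which is deliberately left out here.
final class SchemaQueries {

    private SchemaQueries() {}

    /**
     * Returns a CQL statement that lists the tables of one keyspace.
     * The keyspace name is left as a bind marker (?) for the caller to supply.
     *
     * @param majorVersion Cassandra major version, e.g. 2 for 2.2 or 3 for 3.11
     */
    static String tablesQuery(int majorVersion) {
        if (majorVersion < 1) {
            throw new IllegalArgumentException("unexpected major version: " + majorVersion);
        }
        if (majorVersion >= 3) {
            // Assumption: schema tables moved to the system_schema keyspace in 3.0
            return "SELECT table_name FROM system_schema.tables WHERE keyspace_name = ?";
        }
        // Assumption: 2.x layout, matching the schema_columnfamilies table shown by cqlsh
        return "SELECT columnfamily_name FROM system.schema_columnfamilies WHERE keyspace_name = ?";
    }

    public static void main(String[] args) {
        System.out.println(tablesQuery(2));
        System.out.println(tablesQuery(3));
    }
}
```

Running the returned statement through a Java driver session, with the keyspace name bound to the `?` marker, should list every table in that keyspace, including the ones the Thrift call does not report.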

Chris Lohfink
  • Hey Chris. Thanks, yeah, I heard Thrift was being deprecated. I'm actually using another DB that uses Cassandra as its storage solution (JanusGraph). Some of their code uses the Thrift API to talk to the DB. The program I wrote is made of snippets I took from that code base. – albertlockett Oct 03 '18 at 19:38
  • I tried it using older versions of Cassandra (2.0 and 2.2), which were maintained around the same time as the Astyanax driver, and it had the same behaviour. – albertlockett Oct 03 '18 at 19:39
  • Astyanax was really meant for Cassandra 1.0, 1.1, or 1.2. I think 2.0 support was in beta when it basically quit trying to keep up, as Netflix migrated to the Java driver. Try 1.2 but, honestly, have low expectations. I am sure JanusGraph has some kind of CQL support if they use Cassandra. – Chris Lohfink Oct 03 '18 at 20:22
4

I remember using Astyanax with Cassandra 0.8.8 through 1.1 and 1.2. There was a time when we would shove all the data (columns) into one partition as a blob (which enabled faster writes) and then parse the data out from the Thrift fat client (allegedly Cassandra was slow for reads at that time). We had to keep track of the schema ourselves and, while deserializing the output from the Thrift client, parse it based on the data type of each individual column. All of this changed after the introduction of CQL and, as Chris points out, CQL is the recommended way of working with C*.

0

What versions of Scylla did you try it with? Thrift, though deprecated in Cassandra, is still supported in Scylla.

Peter Corless
  • Hey Peter, thanks for the follow-up. I tried this with Scylla 2.1.0 and 2.1.3. – albertlockett Oct 12 '18 at 00:02
  • Interesting! Did it show the same results on Scylla 2.3 (current release)? (It may well give the same disappointing results, but it's best to rule out a fixed bug.) If you get no joy, consider posting to the group https://groups.google.com/forum/#!forum/scylladb-users – Peter Corless Oct 16 '18 at 20:54
  • Hey Peter. We found out what was going on. We were creating the tables using CQL, and then trying to read them as column families using thrift. Once we swapped it to init to create the column families with CQL, it worked OK – albertlockett Oct 17 '18 at 20:15
  • Woah! Good to know! Thanks. – Peter Corless Oct 19 '18 at 15:25
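
For anyone landing here with the same symptom, the likely mechanism, as far as I understand it: in Cassandra 2.x, Thrift's describe calls generally leave out tables that were created through CQL without `COMPACT STORAGE`, which would also explain why only a few system tables show up in the question's output. If creating the column families through Thrift is not an option, declaring the table `WITH COMPACT STORAGE` in CQL is another way that should keep it visible to Thrift clients. Below is a minimal sketch that builds such a statement. `ThriftVisibleTable` and `createStatement` are made-up names, and the `key`/`column1`/`value` blob layout is an assumption modelled on how cqlsh displays a plain Thrift column family.

```java
// Sketch only: produces DDL text, it does not talk to a cluster.
final class ThriftVisibleTable {

    private ThriftVisibleTable() {}

    /**
     * Builds a CREATE TABLE statement for a table declared WITH COMPACT STORAGE.
     * The three-column blob layout is an assumption, not something from the question.
     */
    static String createStatement(String keyspace, String table) {
        requireIdentifier(keyspace);
        requireIdentifier(table);
        return "CREATE TABLE " + keyspace + "." + table
            + " (key blob, column1 blob, value blob, PRIMARY KEY (key, column1))"
            + " WITH COMPACT STORAGE";
    }

    // Accept only plain unquoted CQL identifiers, so the names are safe to concatenate
    private static void requireIdentifier(String name) {
        if (name == null || !name.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("not a plain CQL identifier: " + name);
        }
    }

    public static void main(String[] args) {
        // "demo" and "events" are placeholder names
        System.out.println(createStatement("demo", "events"));
    }
}
```

Whether this applies to a given setup is worth checking against the Cassandra or Scylla version in use, since `COMPACT STORAGE` and Thrift itself are both deprecated.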