2

As per datastx documentation here, we cannot delete column from tables defined with COMPACT STORAGE option. What is the reason for this?

Aaron
  • 55,518
  • 11
  • 116
  • 132
Manish Khandelwal
  • 2,260
  • 2
  • 15
  • 13

1 Answers1

2

This goes back to the original implementation of CQL3, and changes which were made to allow it to abstract a "SQL-like," wide-row structure on top of the original Thrift-based storage engine. Ultimately, managing the schema comes down to whether or not the underlying structure is a table or a column_family.

As an example, I'll create two tables using an old install of Apache Cassandra (2.1.19):

CREATE TABLE student (
  studentid TEXT PRIMARY KEY,
  fname TEXT,
  name TEXT);

CREATE TABLE studentcomp (
  studentid TEXT PRIMARY KEY,
  fname TEXT,
  name TEXT)
WITH COMPACT STORAGE;

I'll insert one row into each table:

INSERT INTO student (studentid, fname, lname) VALUES ('janderson','Jordy','Anderson');
INSERT INTO studentcomp (studentid, fname, lname) VALUES ('janderson','Jordy','Anderson');

And then I'll look at the tables with the old cassandra-cli tool:

[default@stackoverflow] list student;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: janderson
=> (name=, value=, timestamp=1599248215128672)
=> (name=fname, value=4a6f726479, timestamp=1599248215128672)
=> (name=lname, value=416e646572736f6e, timestamp=1599248215128672)

[default@stackoverflow] list studentcomp;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: janderson
=> (name=fname, value=Jordy, timestamp=1599248302715066)
=> (name=lname, value=Anderson, timestamp=1599248302715066)

Do you see the empty/"ghost" column value in the first result? That empty column value was CQL3's link between the column values and the table's meta data. If it's not there, then CQL cannot be used to manage a table's columns.

The comparator used for type conversion was all that was really exposed via Thrift. This lack of meta data control/exposure is what allowed Cassandra to be considered "schemaless" in the pre-CQL days. If I run a describe studentcomp from within the cassandra-cli, I can see the comparators (validation class) used:

Column Metadata:
  Column Name: lname
    Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Column Name: fname
    Validation Class: org.apache.cassandra.db.marshal.UTF8Type

But if I try describe student, I see this:

WARNING: CQL3 tables are intentionally omitted from 'describe' output.
See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.

Sorry, no Keyspace nor (non-CQL3) ColumnFamily was found with name: student (if this is a CQL3 table, you should use cqlsh instead)

Bascially, tables and column families were different entities forced into the same bucket. Adding WITH COMPACT STORAGE essentially made a table a column family. With that came the lack of any schema management (adding or removing columns), outside of access to the comparators.

Edit 20200905

Can we somehow / someway (hack) drop the columns from table?

You might be able to accomplish this. Sylvain Lebresne wrote A Thrift to CQL3 Upgrade Guide which will have some necessary details for you. I also advise reading through the Jira ticket mentioned above (CASSANDRA-4377), as that covers many of the in-depth technical challenges that make this difficult.

Aaron
  • 55,518
  • 11
  • 116
  • 132
  • Can we somehow / someway (hack) drop the columns from table. I have old backup of my system tables before adding the column. I was thinking of shutting down my all the nodes and restoring old sstables from backup in schema_columns directory. But it did not work? – Manish Khandelwal Sep 05 '20 at 07:09
  • @ManishKhandelwal Edit made. Good luck! – Aaron Sep 05 '20 at 12:56
  • Did try it(rollback of system tables) and succeded but I am apprehensive to apply in production. I am thinking to drop the table and recreate with same name and insert the data. – Manish Khandelwal Sep 06 '20 at 18:02