
I'm writing a Spark-based app and have to drop some tables in Cassandra DB.

I know how to read from tables with `spark.read.format("jdbc")` and how to save a DataFrame with `df.write.format("jdbc")`.

But how can I drop a table that I don't need anymore?
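For context, the read/write pattern described above might look like the sketch below. The URL scheme is driver-dependent (common Cassandra JDBC wrappers use `jdbc:cassandra://host:port/keyspace`), and the host, port, and table names are placeholders:

```python
def cassandra_jdbc_url(host, port, keyspace):
    """Build a JDBC URL in the shape used by common Cassandra JDBC
    wrappers; the exact scheme depends on the driver you ship."""
    return f"jdbc:cassandra://{host}:{port}/{keyspace}"

def read_table(spark, url, table):
    # Read an existing Cassandra table over JDBC into a DataFrame.
    return (spark.read.format("jdbc")
            .option("url", url)
            .option("dbtable", table)
            .load())

def write_table(df, url, table):
    # Append a DataFrame to a Cassandra table over JDBC.
    (df.write.format("jdbc")
       .option("url", url)
       .option("dbtable", table)
       .mode("append")
       .save())
```

Neither the read nor the write path gives you a way to issue DDL such as `DROP TABLE`, which is what the question is about.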

Erick Ramirez
Felix

1 Answer


To drop a Cassandra table, you can simply use the Spark SQL DROP TABLE command:

spark.sql("DROP TABLE table_name")

Note that the JDBC API is limited, so our general recommendation is to use the Spark Cassandra connector. It is fully open-source and free to use.

The Spark Cassandra connector is a library specifically designed for connecting to Cassandra clusters from Spark applications. Cassandra tables are exposed as DataFrames or RDDs and the connector also allows execution of the full CQL API. Cheers!
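With connector 3.x you can register Cassandra as a Spark SQL catalog, after which `DROP TABLE` works through a catalog-qualified name. A minimal sketch (the catalog name `cass`, the contact point `127.0.0.1`, and the exact package coordinates are assumptions, so adjust them to your Spark/Scala build):

```python
def cassandra_catalog_conf(catalog_name):
    """Spark SQL settings that register the Cassandra connector as a
    catalog (connector 3.x; coordinates/version are assumptions)."""
    return {
        "spark.jars.packages":
            "com.datastax.spark:spark-cassandra-connector_2.12:3.4.1",
        "spark.cassandra.connection.host": "127.0.0.1",
        f"spark.sql.catalog.{catalog_name}":
            "com.datastax.spark.connector.datasource.CassandraCatalog",
    }

def qualified_name(catalog, keyspace, table):
    # Catalog-qualified name Spark SQL needs to resolve a Cassandra table.
    return f"{catalog}.{keyspace}.{table}"

def drop_cassandra_table(spark, catalog, keyspace, table):
    # With the catalog registered, DROP TABLE works through Spark SQL.
    spark.sql(f"DROP TABLE {qualified_name(catalog, keyspace, table)}")
```

Usage would be along the lines of applying each config pair via `SparkSession.builder.config(...)`, then calling `spark.sql("DROP TABLE cass.my_keyspace.my_table")`; without a registered catalog, Spark SQL only searches its own metastore, which is why an unqualified `DROP TABLE` can fail even for a table the connector can read.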

Erick Ramirez
  • Well, I'm trying to do `spark.sql("DROP TABLE default.persons__20221006_133524")`, but I'm getting an error: 'Table or view not found: persons__20221006_133524;'. But I can still read this table with `spark.read.format("org.apache.spark.sql.cassandra")`. Maybe I need to set up the Spark session with some additional options to see Cassandra tables from the Spark SQL API? – Felix Feb 07 '23 at 07:59
  • I wonder if it's because the JDBC API doesn't support it. If you update your original question with configuration + minimal code that replicates the problem, I'd be happy to look at it again. Cheers! – Erick Ramirez Feb 07 '23 at 08:24