How to Insert into cassandra table using select from another table in cassandra?

Question

insert into sys.new_table select id + (select max(id) from sys.Old_table),name from sys.Old_table;

By this we can able to insert data from one table to another table in Oracle. How can i write this query in Cassandra?

Old_table
    ID,Case Number,Date
    8534426,HV210935,03/19/2012 12:00:00 PM
    8534427,HV210768,12/16/2011 04:30:00 AM

How can I insert data into new_table with new_table.ID = Max(Old_table.ID)+Old_table.ID and other data as on Old_table using Cassandra? I can do the insertion using above syntax in mysql.

new_table
    ID,Case Number,Date
    8534428,HV210935,03/19/2012 12:00:00 PM
    8534429,HV210768,12/16/2011 04:30:00 AM

Please do suggest me if this can be solved using Spark as well.

score 0 · Answer 1 · answered May 29 '17 at 11:08

This can be done using spark-cassandra connector.

Basic thing to do.

Get data from oldTable.
Get max id from the dataframe
create new dataframe using old dataframe. Note .withColumn should have same column name id

Example Code using scala:

val oldTable = sc.read.formt("org.apache.spark.sql.cassandr")
                 .options(Map("keyspace"->"sys","table"->"Old_table"))
                 .load()

val maxId = oldTable.select(max("id")).collect()(0).getAs[Int](0)

val newTable = oldTable.withColumn("id",lit(maxId).plus(col("id")))

newTable.write.format("org.apache.spark.sql.cassandr")
        .options(Map("keyspace"->"sys","table"->"new_table"))
        .save()

This is just a example code, where sc is SQLContext/HiveContext.

Based on your data size you can use .cache() on oldTable.. etc.

Modify the code based on your requirement.

How to Insert into cassandra table using select from another table in cassandra?

1 Answers1