I have cassandra table containing information about ship identities. It has column: imo_number, mmsi_number, ship_name, timestamp.
I have 2 requirement for this table:
- I want this table to only save row with last updated records, according to column imo_number, mmsi_number, and ship_name. So if any new record with value of those column is exacly same, it will be updated. As I know, I need to define those column as primary key in cassandra.
create table keyspace.table (
mmsi_number text,
imo_number text,
ship_name text,
timestamp timestamp,
primary key ((mmsi_number), imo_number, ship_name)
);
- I want to be able to load data from table by mmsi as where clause, and sort it by timestamp to find the latest record, or find records between a date range. As I know, the CQL schema will be:
create table keyspace.table (
mmsi_number text,
imo_number text,
ship_name text,
timestamp timestamp,
primary key ((mmsi_number), timestamp, imo_number, ship_name)
) with clustering order by (timestamp desc);
The problem is, when I use my second schema, my first requirement will not be fulfilled. Because every new record will have different timestamp, so it will be inserted instead updated.
How do I fulfilled my requirements above? Or maybe I did something wrong? Help appreciated.