Summary
Our team inherited this sequence generator, which is implemented on top of Cassandra:
Table
CREATE TABLE IF NOT EXISTS sequences (
id_name varchar,
next_id bigint,
instance_name varchar,
PRIMARY KEY (id_name)
) WITH COMPRESSION = { ... };
GET_LOCK("UPDATE sequences USING TTL 10 set instance_name = ? where id_name = ? IF instance_name = null", ConsistencyLevel.LOCAL_QUORUM),
SELECT_SEQUENCE("SELECT next_id from sequences where id_name = ?", ConsistencyLevel.LOCAL_QUORUM),
UPDATE_SEQUENCE("UPDATE sequences SET next_id = ? where id_name = ? IF next_id = ?", ConsistencyLevel.LOCAL_QUORUM),
REMOVE_LOCK("UPDATE sequences set instance_name = null where id_name = ? IF instance_name = ?", ConsistencyLevel.LOCAL_QUORUM);
(Note: in the Java code, ConsistencyLevel was set to LOCAL_SERIAL.)
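To make the intended flow easier to discuss, here is a minimal sketch of how I read the four statements: take the lock, read next_id, conditionally bump it, release the lock. The call order is an assumption inferred from the statement names, since the calling Java code is not shown here. The class SequenceRowSketch, its method names, the -1 failure value and the choice to hand out the value that was read are all hypothetical. It uses an in-memory stand-in for one row instead of a real Cassandra session.

```java
import java.util.Objects;

// Hypothetical in-memory stand-in for one row of the sequences table.
// Each method mirrors one conditional statement; TTL expiry is NOT modeled.
class SequenceRowSketch {
    private long nextId;
    private String instanceName; // null means "no lock holder"

    SequenceRowSketch(long initialNextId) {
        this.nextId = initialNextId;
    }

    // GET_LOCK: set instance_name only IF it is currently null.
    synchronized boolean getLock(String instance) {
        if (instanceName != null) return false;
        instanceName = instance;
        return true;
    }

    // SELECT_SEQUENCE: read next_id.
    synchronized long selectSequence() {
        return nextId;
    }

    // UPDATE_SEQUENCE: set next_id only IF it still equals the expected value.
    synchronized boolean updateSequence(long newValue, long expected) {
        if (nextId != expected) return false;
        nextId = newValue;
        return true;
    }

    // REMOVE_LOCK: clear instance_name only IF this instance holds the lock.
    synchronized boolean removeLock(String instance) {
        if (!Objects.equals(instanceName, instance)) return false;
        instanceName = null;
        return true;
    }

    // Assumed call order: lock, read, conditional update, unlock.
    // Returns the allocated id, or -1 if the lock or the update was not applied.
    long allocate(String instance) {
        if (!getLock(instance)) return -1;
        try {
            long current = selectSequence();
            return updateSequence(current + 1, current) ? current : -1;
        } finally {
            removeLock(instance);
        }
    }
}
```

In this idealized model every step is atomic and immediately visible, so two callers cannot receive the same id. It does not model the 10-second TTL, retries, timeouts or Cassandra's consistency levels, so it cannot reproduce the duplicate by itself; it only pins down the flow being asked about.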
It was running fine until yesterday, when we found that two different Java app nodes had received the same sequence number.
Timestamps when this happened:
AppNode 1
getlock: 4:25:14.480
UpdateSequence: 4:25:14.486
AppNode 2
getlock: 4:25:14.489
UpdateSequence: 4:25:14.496
How can this happen, and how can we find out exactly what happened?