I want to create a table with fixed number of rows (lets say N), where if N+1th row was added, then 1st row would be removed.
This is the table, I use for storage of last N best results from graph analysis:
CREATE TABLE IF NOT EXISTS lp_registry.best (
value float, // best value for current graph
verts int, // number of vertices in graph
edges int, // number of edges in graph
wid text, // worker id
id timeuuid, // timeuuid
PRIMARY KEY (wid, id)
) WITH CLUSTERING ORDER BY (id ASC);
I've read about expiring data at DataStax, but found only TTL expirations. So I decided to do it in following way.
My Approach A:
Everytime a new result is wanted to be added, id of oldest row is retrieved..
SELECT wid, id FROM lp_registry.best LIMIT 1;
..as well as current number of rows..
SELECT COUNT(*) FROM FROM lp_registry.best;
Consequently if count >= N, then the oldest row is removed and the newest is added...
BEGIN BATCH
INSERT INTO lp_registry.best (value, verts, edges, wid, id) VALUES (?, ?, ?, ? now());
DELETE FROM lp_registry.best WHERE wid = ? AND id = ?;
APPLY BATCH;
This approach has problem with that first selects are not atomic operations together with the following batch. So if any other worker deleted oldest row between select and batch, or N was exceeded, then this wouldn't work.
My Approach B:
Same first steps ...
SELECT wid, id FROM lp_registry.best LIMIT 1;
SELECT COUNT(*) FROM FROM lp_registry.best;
Then try to delete oldest row again and again until success..
if count < N {
INSERT INTO lp_registry.best (value, verts, edges, wid, id) VALUES (?, ?, ?, ? now());
} else {
while not success {
DELETE FROM lp_registry.best WHERE wid = ? AND id = ? IF EXISTS;
}
INSERT INTO lp_registry.best (value, verts, edges, wid, id) VALUES (?, ?, ?, ? now());
}
In this approach there is still trouble with exceeding N in the database, before count < N is checked.
Can you point me to the right solution?