3

How is it possible to mark a row in a ksql table for deletion via Rest api or at least as a statement in ksqldb-cli?

CREATE TABLE movies (
      title VARCHAR PRIMARY KEY,
      id INT,
      release_year INT
    ) WITH (
      KAFKA_TOPIC='movies',
      PARTITIONS=1,
      VALUE_FORMAT = 'JSON'
    );

INSERT INTO MOVIES (ID, TITLE, RELEASE_YEAR) VALUES (48, 'Aliens', 1986);

This doesn't work for obvious reasons, but DELETE statement doesn't exist in ksqldb:

INSERT INTO MOVIES (ID, TITLE, RELEASE_YEAR) VALUES (48, null, null);

Is there a way to create a recommended tombstone null value or do I need to write it directly to the underlying topic?

Kubus
  • 677
  • 6
  • 18

1 Answers1

9

There is a way to do this that's a bit of a workaround. The trick is to use the KAFKA value format to write a tombstone to the underlying topic.

Here's an example, using your original DDL.

-- Insert a second row of data
INSERT INTO MOVIES (ID, TITLE, RELEASE_YEAR) VALUES (42, 'Life of Brian', 1986);

-- Query table
ksql> SET 'auto.offset.reset' = 'earliest';

ksql> select * from movies emit changes limit 2;
+--------------------------------+--------------------------------+--------------------------------+
|TITLE                           |ID                              |RELEASE_YEAR                    |
+--------------------------------+--------------------------------+--------------------------------+
|Life of Brian                   |42                              |1986                            |
|Aliens                          |48                              |1986                            |
Limit Reached
Query terminated

Now declare a new stream that will write to the same Kafka topic using the same key:

CREATE STREAM MOVIES_DELETED (title VARCHAR KEY, DUMMY VARCHAR) 
  WITH (KAFKA_TOPIC='movies', 
       VALUE_FORMAT='KAFKA');

Insert a tombstone message:

INSERT INTO MOVIES_DELETED (TITLE,DUMMY) VALUES ('Aliens',CAST(NULL AS VARCHAR));

Query the table again:

ksql> select * from movies emit changes limit 2;
+--------------------------------+--------------------------------+--------------------------------+
|TITLE                           |ID                              |RELEASE_YEAR                    |
+--------------------------------+--------------------------------+--------------------------------+
|Life of Brian                   |42                              |1986                            |

Examine the underlying topic

ksql> print movies;
Key format: KAFKA_STRING
Value format: JSON or KAFKA_STRING
rowtime: 2021/02/22 11:01:05.966 Z, key: Aliens, value: {"ID":48,"RELEASE_YEAR":1986}, partition: 0
rowtime: 2021/02/22 11:02:00.194 Z, key: Life of Brian, value: {"ID":42,"RELEASE_YEAR":1986}, partition: 0
rowtime: 2021/02/22 11:04:52.569 Z, key: Aliens, value: <null>, partition: 0
Robin Moffatt
  • 30,382
  • 3
  • 65
  • 92
  • Thank you Robin for the workaround. This should be reported to the ksqldb team. It should be easier to insert deleted tombstone values (without intermediate streams or value formats). I haven't received the row.tombstone in the response json object in a [push query](https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-rest-api/query-endpoint/#json-parameters) in this way. – Kubus Feb 22 '21 at 15:26
  • Feel free to upvote/mark as correct if you feel it is ;) You can log issues & suggest enhancements to ksqlDB here: https://github.com/confluentinc/ksql/ – Robin Moffatt Feb 22 '21 at 15:29
  • 1
    sure I upvoted you, I was just wondering why I didn't get a row similar to this {"row":{"columns":['Alien', null,null],"tombstone":true}} via Rest API, because it was my original issue :) Next time I will be more verbose, but I expected that it will work, after I achieve to insert the tombstone value. thx again – Kubus Feb 22 '21 at 15:57
  • I am using KSQL 6.2.0, My Table has composite key. I could see following in the PRINT {tbl} FROM BEGINNING; rowtime: 2021/08/08 17:43:27.579 Z, key: {"r_n":"123-234","k_n":1,"k_2":1,"country":"IND","t":"image","d":"2021-08-08"}, value: , partition: 0 But still the data exists in the table. Any help here ? – Shivakumar ss Aug 08 '21 at 17:50
  • + Even the rest api is returning the data without the tombstone attribute. – Shivakumar ss Aug 08 '21 at 17:51
  • @ShivaKumarSS Please start a new question. – Robin Moffatt Aug 09 '21 at 19:35