4

I know that in KSQL we can set the offset to earliest or latest. But can we get data from a specific time period? For example, I need to get the data inserted into a topic since 06-May-2020.

Matthias J. Sax
Raghava Reddy

1 Answer

5

In ksqlDB you can query from the beginning (SET 'auto.offset.reset' = 'earliest';) or end of a topic (SET 'auto.offset.reset' = 'latest';).

You cannot currently (0.8.1 / CP 5.5) seek to an arbitrary offset.

What you can do is start from the earliest offset and then use ROWTIME in your predicate to identify messages that match your requirement.

SELECT *
  FROM MY_SOURCE_STREAM
 WHERE ROWTIME >= 1588772149620
  EMIT CHANGES;

Note that this scans through the topic sequentially, so depending on how much data the topic holds, it may not be particularly fast.
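
The value compared against ROWTIME is a Unix epoch timestamp in milliseconds. As a rough sketch of how you might derive that number for the date in the question (06-May-2020), here is a small Python helper. The function name `to_epoch_millis` is hypothetical, and it assumes the cutoff is midnight UTC; adjust if your timestamps should be interpreted in another time zone.

```python
from datetime import datetime, timezone


def to_epoch_millis(date_text: str, fmt: str = "%Y-%m-%d") -> int:
    """Return milliseconds since the Unix epoch for a date string.

    Assumption: the date is interpreted as midnight UTC.
    """
    dt = datetime.strptime(date_text, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


# Cutoff for 06-May-2020, ready to paste into the ROWTIME predicate.
cutoff = to_epoch_millis("2020-05-06")
print(f"WHERE ROWTIME >= {cutoff}")
```

For 06-May-2020 at midnight UTC this gives 1588723200000, which you can use in place of the literal in the query above.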

Robin Moffatt
  • Thanks @Robin Moffatt. I understand that starting from the earliest offset can be a performance issue. I can use this solution when KSQL struggles to stream data. Can you please suggest how to add ROWTIME to the predicate in a KSQL query? Please provide a sample KSQL query if my ROWTIME is '1588772149620'. – Raghava Reddy May 06 '20 at 13:39
  • I've added an example – Robin Moffatt May 06 '20 at 14:04
  • Can I get messages from ROWTIME=1588772149620? I believe this query will fetch only a single record. Basically, KSQL streams stop if no data is pushed for a couple of days, so I want to restart a stream from the last message that I consumed. – Raghava Reddy May 06 '20 at 15:42
  • I've updated my example to use `greater than or equal to` instead of `equal to` in the predicate – Robin Moffatt May 06 '20 at 16:22
  • If you had a consumer then it was restarted, is there any way to get the records that were missed during the restart process? – Mohamed Ahmed Taher Mohamed Jun 18 '23 at 09:28