I deployed a Redpanda cluster and would like to query an offset by timestamp.
I first tried the confluent-kafka Python library:
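import confluent_kafka as ck
import uuid

# Use a fresh group id so no committed offsets interfere with the lookup.
c = ck.Consumer({
    'bootstrap.servers': 'redpanda-bootstrap.example.com:9094',
    'group.id': f'test-{uuid.uuid4()}',
})

# For offsets_for_times, the third field carries the timestamp in ms since epoch.
tp = ck.TopicPartition('log-feed-test', 0, 1689584185555)
print(tp)

# Returns a list of TopicPartition with the offset field set to the result.
tp = c.offsets_for_times([tp])
print(tp)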
This returns -1, which means all data is before that timestamp. But I'm sure there is data after that timestamp, because when I consume with the offset set to latest, I can print msg.timestamp(), which gives me (1, 1689586682955). Here 1 (TIMESTAMP_CREATE_TIME) means it is a valid timestamp.
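As a sanity check on the two values above, here is a minimal sketch that converts both epoch-millisecond timestamps to UTC and compares them. The helper name ms_to_utc is only illustrative and is not part of any Kafka client library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_utc(ts_ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ts_ms)

query_ms = 1689584185555    # timestamp passed to offsets_for_times
message_ms = 1689586682955  # timestamp reported by msg.timestamp()

# Positive means the consumed message is newer than the queried timestamp.
delta_ms = message_ms - query_ms

print(ms_to_utc(query_ms).isoformat())
print(ms_to_utc(message_ms).isoformat())
print(f"message is {delta_ms / 1000:.1f} s after the queried timestamp")
```

If the difference is positive, as it is here, a timestamp lookup would be expected to return a real offset rather than -1.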
I also tried pykafka, with the same result: it returns -1 as the offset. I also tried Kafka's consumer group script to reset the group's offset by time, and it always sets the offset to the latest.
It looks to me more like a Redpanda issue, as if it doesn't support this feature.
I even tried redpanda-console: in the topic view, I configured the start offset by timestamp, and it still loaded only the latest 1 message for me.
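For reference, the lookup expected here can be sketched in plain Python. This assumes the usual Kafka semantics for a timestamp lookup (the earliest offset whose timestamp is greater than or equal to the query, or -1 if there is none). The function name and the sample timestamps are hypothetical, and the sketch runs on an in-memory list rather than against a broker.

```python
from typing import Sequence

def expected_offset_for_time(timestamps_ms: Sequence[int],
                             query_ms: int,
                             base_offset: int = 0) -> int:
    """Return the earliest offset whose timestamp is >= query_ms, else -1.

    timestamps_ms[i] is the timestamp of the message at base_offset + i.
    A linear scan is used because message timestamps are not guaranteed
    to be monotonically increasing within a partition.
    """
    for i, ts in enumerate(timestamps_ms):
        if ts >= query_ms:
            return base_offset + i
    return -1

# Hypothetical partition contents: only the last value comes from the post.
sample = [1689580000000, 1689583000000, 1689586682955]

print(expected_offset_for_time(sample, 1689584185555))
print(expected_offset_for_time(sample, 1689590000000))
```

With a message at 1689586682955 in the partition, a query for 1689584185555 should resolve to that message's offset under these semantics; -1 would only be expected when the query is later than every message.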