I deployed a Redpanda cluster and would like to query offsets by timestamp.

I first tried the confluent-kafka Python library:

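import uuid

import confluent_kafka as ck

c = ck.Consumer({
    'bootstrap.servers': 'redpanda-bootstrap.example.com:9094',
    'group.id': f'test-{uuid.uuid4()}',
})

# The third argument is normally an offset; for offsets_for_times() it carries
# the timestamp to look up, in milliseconds since the epoch.
tp = ck.TopicPartition('log-feed-test', 0, 1689584185555)
print(tp)

# The result is a list of TopicPartition objects whose .offset is the earliest
# offset with a timestamp at or after the requested one.
result = c.offsets_for_times([tp], timeout=10)
print(result)

c.close()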

This returns an offset of -1, which means no message was found with a timestamp at or after the one I asked for. But I'm sure there is data after that timestamp: when I consume from the latest offset and print msg.timestamp(), I get (1, 1689586682955). Here 1 is the timestamp type (create time), so the timestamp is valid.
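To make the expected behaviour concrete, here is a minimal sketch of the lookup a timestamp query is supposed to perform. It is a pure-Python model, not Redpanda's or librdkafka's implementation; the `offset_for_time` helper and the in-memory `partition` list are hypothetical, and the first record's timestamp is made up for illustration.

```python
from typing import List, Tuple

OFFSET_END = -1  # the offset reported when no message matches the query


def offset_for_time(records: List[Tuple[int, int]], target_ms: int) -> int:
    """Return the earliest offset whose timestamp is >= target_ms.

    `records` is a list of (offset, timestamp_ms) pairs in offset order,
    standing in for one partition. Returns OFFSET_END if nothing matches.
    """
    for offset, ts_ms in records:
        if ts_ms >= target_ms:
            return offset
    return OFFSET_END


# Hypothetical partition contents; the last timestamp is the one printed
# by msg.timestamp() in the question.
partition = [
    (0, 1689584000000),
    (1, 1689584185555),
    (2, 1689586682955),
]

print(offset_for_time(partition, 1689584185555))  # query from the question
print(offset_for_time(partition, 1689590000000))  # later than every message
```

Under this model, a query for 1689584185555 against a partition that holds a message stamped 1689586682955 should resolve to a real offset, so getting -1 suggests the stored timestamps are not what the index lookup expects.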

I also tried pykafka, with the same result: it returns -1 as the offset. I also tried Kafka's consumer group script to reset the group's offset by time; it always resets it to the latest offset.

This looks to me like a Redpanda issue, as if it doesn't support this feature.

I even tried Redpanda Console: in the topic view, I set the start offset by timestamp, and it still loads only the latest message.

OneCricketeer
Xiang Zhang

1 Answer

OK, issue found. It is not in Redpanda; it is in the producer. I was using the sarama Go library for producing, and that is where the problem was. After switching to franz-go as the producer, it works.
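Since the cause was on the producer side, one way to compare what different producers write is to decode the tuple that msg.timestamp() returns for consumed messages. The sketch below is only an illustration: `describe_timestamp` is a hypothetical helper, and the type codes 0, 1, and 2 are assumed to mirror confluent-kafka's TIMESTAMP_NOT_AVAILABLE, TIMESTAMP_CREATE_TIME, and TIMESTAMP_LOG_APPEND_TIME constants.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumed to match confluent_kafka.TIMESTAMP_* constants.
TIMESTAMP_TYPES = {
    0: "not available",
    1: "create time",
    2: "log append time",
}


def describe_timestamp(ts: Tuple[int, int]) -> str:
    """Turn a (timestamp_type, timestamp_ms) tuple into a readable string."""
    ts_type, ts_ms = ts
    label = TIMESTAMP_TYPES.get(ts_type, "unknown type")
    if ts_type == 0:
        # The millisecond value carries no meaning in this case.
        return label
    # Split with integer arithmetic to avoid float rounding on milliseconds.
    secs, millis = divmod(ts_ms, 1000)
    when = datetime.fromtimestamp(secs, tz=timezone.utc) + timedelta(milliseconds=millis)
    return f"{label}: {when.isoformat(timespec='milliseconds')}"


# Tuple from the question, as printed by msg.timestamp().
print(describe_timestamp((1, 1689586682955)))
```

Running this over a few messages from each producer makes it easier to spot differences in timestamp type or value between them.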

Xiang Zhang