I am new to Apache Pulsar (and to message queue systems in general), and I have a question about the Pulsar Reader.

Question description:
I launched a Pulsar instance and started a consumer listening on topic A. Then I started a producer and sent 100 messages to topic A. The consumer consumed all 100 messages, and the backlog of the consumer's subscription is now 0. There is only one subscription on the topic, and it is exclusive.

After that, I started a Reader on topic A, and the Reader can still get the messages from topic A.

I found this on Pulsar docs: https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/

Pulsar brokers are responsible for handling messages that pass through Pulsar, including persistent storage of messages. By default, brokers:

immediately delete all messages that have been acknowledged on every subscription,
and persistently store all unacknowledged messages in a backlog.

So the 100 messages should already have been deleted. Why can the Pulsar Reader still get messages from topic A?

My code:

consumer:

private static void consume() {
    try {
        PulsarClient pulsarClient = PulsarClient.builder().serviceUrl("pulsar://127.0.0.1:6650").build();
        // No subscription type is set, so the default (Exclusive) is used.
        Consumer<String> consumer = pulsarClient.newConsumer(Schema.STRING)
                .topic("A")
                .subscriptionName("first-subscription")
                .subscribe();

        while (true) {
            try {
                Message<String> msg = consumer.receive();
                String m = msg.getValue();
                System.out.println("\t m:" + m);
                // Acknowledge each message so the subscription backlog drops to 0.
                consumer.acknowledge(msg);
                Thread.sleep(500);
            } catch (Exception e) {
                LOGGER.error("", e);
            }
        }
    } catch (Exception e) {
        LOGGER.error("", e);
    }
}

producer:

private static void produce() {
    try {
        PulsarClient pulsarClient = PulsarClient.builder().serviceUrl("pulsar://127.0.0.1:6650").build();
        Producer<String> producer = pulsarClient.newProducer(Schema.STRING).topic("A").create();
        for (int i = 0; i < 100; ++i) {
            producer.send("producer-simple-partitioned-" + i);
        }
    } catch (Exception e) {
        LOGGER.error("", e);
    }
}

Reader:

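private static void readerRead() {
    try {
        PulsarClient pulsarClient = PulsarClient.builder().serviceUrl("pulsar://127.0.0.1:6650").build();
        // A Reader has no durable subscription; it starts from the position given here.
        Reader<byte[]> reader = pulsarClient.newReader()
                .topic("A")
                .startMessageId(MessageId.earliest)
                .create();
        while (true) {
            Message<byte[]> message = reader.readNext();
            // Decode explicitly as UTF-8 rather than relying on the platform default charset.
            System.out.println(new String(message.getData(), java.nio.charset.StandardCharsets.UTF_8));
        }
    } catch (PulsarClientException e) {
        LOGGER.error("", e);
    }
}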

1 Answer

The behavior you see with the Reader interface is a plausible outcome. The documentation is not accurate on this point: messages are not deleted immediately when they are acknowledged. They remain available for reading as long as the ledger that contains them still exists.

In fact, this part of the documentation will be changed in a future release:

Note that messages that are no longer being stored are not necessarily immediately deleted, and may in fact still be accessible until the next ledger rollover. Because clients cannot predict when rollovers may happen, it is not wise to rely on a rollover not happening at an inconvenient point in time.

and

By default, when a Pulsar message arrives at a broker it will be stored until it has been acknowledged on all subscriptions, at which point it will be marked for deletion.
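
To make the ledger behaviour above more concrete, here is a small toy model in plain Java. It is only a sketch of the idea and not how Pulsar is actually implemented: the class `LedgerToyModel`, the `maxEntriesPerLedger` threshold and the "trim on rollover" rule are all assumptions made up for the illustration. It assumes a single subscription, and it only ever deletes a whole closed ledger once every entry in it has been acknowledged.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; this is NOT Pulsar's real storage code.
class LedgerToyModel {
    private final int maxEntriesPerLedger;  // hypothetical rollover threshold
    private final List<List<String>> ledgers = new ArrayList<>();
    private int published = 0;  // total entries ever written
    private int acked = 0;      // entries acknowledged by the single subscription
    private int deleted = 0;    // entries removed together with their ledger

    LedgerToyModel(int maxEntriesPerLedger) {
        this.maxEntriesPerLedger = maxEntriesPerLedger;
        ledgers.add(new ArrayList<>());  // the currently open ledger
    }

    void publish(String message) {
        List<String> open = ledgers.get(ledgers.size() - 1);
        if (open.size() == maxEntriesPerLedger) {
            // Rollover: close the full ledger, open a new one, then try to trim.
            open = new ArrayList<>();
            ledgers.add(open);
            trim();
        }
        open.add(message);
        published++;
    }

    void ackAll() {
        acked = published;
        trim();
    }

    // Delete closed ledgers from the front once they are fully acknowledged.
    // The open (last) ledger is never deleted, even if everything in it is acked.
    private void trim() {
        while (ledgers.size() > 1 && deleted + ledgers.get(0).size() <= acked) {
            deleted += ledgers.remove(0).size();
        }
    }

    // What a reader starting at the earliest position would still see.
    List<String> readFromEarliest() {
        List<String> visible = new ArrayList<>();
        for (List<String> ledger : ledgers) {
            visible.addAll(ledger);
        }
        return visible;
    }

    public static void main(String[] args) {
        // Same shape as the question: 100 messages, all acked, no rollover yet.
        LedgerToyModel noRollover = new LedgerToyModel(1000);
        for (int i = 0; i < 100; i++) noRollover.publish("msg-" + i);
        noRollover.ackAll();
        System.out.println("No rollover, still readable: " + noRollover.readFromEarliest().size());

        // Smaller ledgers: two ledgers have rolled over and can be deleted.
        LedgerToyModel withRollover = new LedgerToyModel(40);
        for (int i = 0; i < 100; i++) withRollover.publish("msg-" + i);
        withRollover.ackAll();
        System.out.println("With rollover, still readable: " + withRollover.readFromEarliest().size());
    }
}
```

In this model the first case keeps all 100 acknowledged messages readable because they sit in the ledger that is still open, which matches what you observed. In the second case only the messages in the last, still open ledger remain.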

Sergii Zhevzhyk