
The docs at https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#get-the-ttl-configuration-for-a-namespace are a little confusing about the difference between backlog quotas and TTL.

As I understand it so far: when a message arrives at the broker, the broker finds all subscriptions on that topic and puts the message into each subscription's backlog. When one subscription acknowledges the message, it is removed from that subscription's backlog (the backlog is per subscription). Once the message is in no backlog (meaning every subscription has acknowledged it), it is considered acknowledged, and the retention policy kicks in to decide whether it is deleted or kept for some time.

If a message stays unacknowledged in one backlog for some time and the backlog reaches its size limit, the backlog quota policy kicks in. So this is more about size than time. If we use consumer_backlog_eviction, the message will be discarded from the backlog. But is it then considered acknowledged, so that the retention policy described above kicks in?

And with TTL: if a message is not acknowledged for some time, will it be removed from all backlogs, then be considered acknowledged and handed to the same retention policy?

UPDATE:

To make the question more precise:

The backlog quotas documentation says:

consumer_backlog_eviction: The broker will begin discarding backlog messages

Does discarding mean marking the message as acknowledged, so that the global retention policy can kick in?

producer_request_hold: The broker will hold and not persist produce request payload

Does this mean the broker will not put new messages into the backlog? Are those incoming messages automatically acknowledged or not (say there is just one subscription at that moment)? And does this block the actual producer? (I guess not; it is just that the broker won't put new messages into the backlog anymore.)

(for TTL) If disk space is a concern, you can set a time to live (TTL) that determines how long unacknowledged messages will be retained.

Again, if the TTL is exceeded and the message is no longer "retained", does that mean it is marked as acknowledged, or is it just thrown away?

Xiang Zhang

1 Answer


If we use consumer_backlog_eviction, the message will be discarded from the backlog. But is it then considered acknowledged, so that the retention policy described above kicks in?

The message will be acknowledged and marked for deletion. Then the retention policy for acknowledged messages will kick in at some point depending on the configuration.
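To make the "evicted means acknowledged" idea concrete, here is a minimal toy model in Python. It is not Pulsar code: the `ToyTopic` class, its method names, and the message-count quota are all assumptions made up for illustration (a real backlog quota is a size limit configured on the namespace). It only shows that once the slow subscription's oldest entries are dropped, those messages sit in no backlog and become candidates for the retention policy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """Hypothetical in-memory topic: one shared log, one unacked list per subscription."""
    quota: int                                    # max unacked messages per subscription (toy unit)
    log: list = field(default_factory=list)       # message ids in publish order
    backlogs: dict = field(default_factory=dict)  # subscription name -> unacked ids

    def subscribe(self, name):
        self.backlogs[name] = []

    def publish(self, msg_id):
        self.log.append(msg_id)
        for backlog in self.backlogs.values():
            backlog.append(msg_id)
            # consumer_backlog_eviction modelled as "drop oldest":
            # the dropped entries count as acknowledged for this subscription.
            while len(backlog) > self.quota:
                backlog.pop(0)

    def ack(self, name, msg_id):
        self.backlogs[name].remove(msg_id)

    def fully_acked(self):
        """Messages in no backlog; only these fall under the retention policy."""
        pending = {m for b in self.backlogs.values() for m in b}
        return [m for m in self.log if m not in pending]


topic = ToyTopic(quota=2)
topic.subscribe("slow")   # never acknowledges anything
topic.subscribe("fast")   # acknowledges immediately
for i in range(4):
    topic.publish(i)
    topic.ack("fast", i)

print(topic.backlogs["slow"])  # the two newest messages remain unacked
print(topic.fully_acked())     # the evicted ones are now handled by retention
```

In this sketch the retention policy itself is left out on purpose; the point is only which messages reach it.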

And with TTL: if a message is not acknowledged for some time, will it be removed from all backlogs, then be considered acknowledged and handed to the same retention policy?

The TTL should apply to all backlogs, and expired unconsumed messages will be acknowledged automatically. Again, the retention policy for acknowledged messages then kicks in.
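The same idea for TTL, again as a toy sketch rather than the broker's actual implementation: the function name `expire_by_ttl`, the dictionaries, and the timestamps are hypothetical. Every backlog drops entries older than the TTL, which is the equivalent of acknowledging them on the consumer's behalf.

```python
def expire_by_ttl(backlogs, publish_time, now, ttl_seconds):
    """Return new backlogs with entries older than ttl_seconds dropped (treated as acked)."""
    return {
        sub: [m for m in pending if now - publish_time[m] <= ttl_seconds]
        for sub, pending in backlogs.items()
    }


# Hypothetical publish timestamps in seconds.
publish_time = {"m1": 0, "m2": 50, "m3": 100}
backlogs = {"sub-a": ["m1", "m2", "m3"], "sub-b": ["m3"]}

after = expire_by_ttl(backlogs, publish_time, now=120, ttl_seconds=60)
print(after)  # m1 and m2 are older than 60 s, so they leave every backlog
```

After this step m1 and m2 are in no backlog, so whether they are deleted or kept depends only on the retention policy for acknowledged messages.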

Sergii Zhevzhyk
  • Thanks for the clarification. I've just updated my question, so how about the `producer_request_hold` policy? Does it mean that the new message will not be put into the backlog and automatically acknowledged? (assume only one backlog at that moment) – Xiang Zhang Mar 17 '20 at 10:53
  • It seems that if the `producer_request_hold` policy is enabled and the quota is exceeded, the Pulsar broker will start to disconnect producers to reduce the load. – Sergii Zhevzhyk Mar 17 '20 at 11:11
  • That is the second policy, `producer_exception`. So I guess `producer_request_hold` means "Drop Latest", where the new message gets acknowledged, while `consumer_backlog_eviction` tries to achieve "Drop Oldest" semantics, where the oldest message in the backlog gets acknowledged. – Xiang Zhang Mar 17 '20 at 11:32
  • The implementations of `producer_exception` and `producer_request_hold` are very similar. Both hold producers back from sending new messages. The only difference is how the problem is communicated to the client: `producer_exception` will throw an exception on the client side, while `producer_request_hold` will display a warning. In general, it is a very interesting topic and it probably deserves its own question on SO or in [slack](https://apache-pulsar.slack.com/) – Sergii Zhevzhyk Mar 17 '20 at 12:02
  • Yes, I guess "Drop Latest" is hard to achieve, because then the message order is undefined: some messages are still waiting in the backlog while a newer message is already acknowledged. – Xiang Zhang Mar 17 '20 at 12:08
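
A toy sketch of the producer-side policies as described in the comments above, assuming that neither one acknowledges or drops anything and that they differ only in how the producer is told. The function `try_publish`, the `QuotaExceeded` exception, and the message-count quota are hypothetical names for illustration, not Pulsar APIs.

```python
class QuotaExceeded(Exception):
    """Hypothetical error surfaced to the producer under producer_exception."""


def try_publish(backlog, msg_id, quota, policy):
    """Append msg_id if there is room; otherwise refuse without acknowledging anything."""
    if len(backlog) < quota:
        backlog.append(msg_id)
        return True
    if policy == "producer_exception":
        raise QuotaExceeded(f"backlog quota of {quota} reached")
    if policy == "producer_request_hold":
        return False  # request is held; the message is not persisted
    raise ValueError(f"unknown policy: {policy}")


backlog = ["a", "b"]
accepted = try_publish(backlog, "c", quota=2, policy="producer_request_hold")
print(accepted, backlog)  # the new message is refused, the backlog is untouched
```

Under this reading, neither policy gives "Drop Latest" semantics: the existing backlog keeps its order and the new message simply never enters it.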