
I am looking into building a simple solution where producer services push events to a message queue, and a streaming service then makes those events available through a gRPC streaming API.

Cloud Pub/Sub seems well suited for the job; however, scaling the streaming service means that each copy of that service would need to create its own subscription and delete it before scaling down, which seems unnecessarily complicated and not what the platform was intended for.

On the other hand, Kafka seems to work well for something like this, but I'd like to avoid having to manage the underlying platform itself and instead leverage the cloud infrastructure.

I should also mention that the reason for having a streaming API is to allow streaming towards a frontend (which may not have access to the underlying infrastructure).

Is there a better way to go about doing something like this with the GCP platform without going the route of deploying and managing my own infrastructure?

Alexandre Thenorio
  • Not sure I follow the scaling note; Cloud Pub/Sub allows multiple subscribers on the same subscription to scale throughput up and down. What exactly did you mean? – Elad Amit Mar 17 '19 at 18:38
  • The idea with the gRPC streaming service would be to stream *all* events coming into the pubsub topic. If 2 instances of the same service both subscribe to the same subscription, this means they will each receive about 50% of the incoming events, which means anyone calling the API on any of these 2 instances would receive only half the events. For the streaming service to work, each instance needs to receive 100% of the events, meaning they cannot subscribe to the same subscription. Maybe I am going about this the wrong way? – Alexandre Thenorio Mar 18 '19 at 10:16
  • When you say "that seems unnecessarily complicated and not what the platform was intended for," to which "platform" are you referring? Subscriptions in Cloud Pub/Sub are essentially the equivalent of consumer groups in Kafka. Is there a major difference you see between the two? – Kamal Aboul-Hosn Mar 18 '19 at 18:19
  • It kinda sounds like you want each instance to keep track of all events. If you are adding instances to increase query capacity, you may want to separate responsibilities à la CQRS (i.e., query nodes querying a data store and ingest nodes pushing the events into it) – Elad Amit Mar 19 '19 at 09:29
  • @KamalAboul-Hosn What I am referring to is generating a new subscription on the fly for every instance of a service, for what are essentially ephemeral messages, and making sure that subscription is deleted when the instance goes down so it does not cost money. In Kafka I can just have one topic and attach separate subscribers, each of which can keep track of its own offset in the queue independently – Alexandre Thenorio Mar 19 '19 at 09:31
  • @EladAmit The issue here is I want to stream the messages to something like a frontend service (i.e. WebSockets, gRPC, GraphQL, etc.). Because the frontend will not have access to the underlying infrastructure, having it talk to Pub/Sub directly is not an option; therefore I need a service with an API handling that and serving gRPC streams. That service needs to stream that data from somewhere, and right now that is Pub/Sub, as it makes sense for the data to come in there – Alexandre Thenorio Mar 19 '19 at 09:34
  • In that case I'm going to echo what @Kamal mentioned above: you will have this problem whether you're talking about Kafka or Pub/Sub, as it is just how these things work. It kinda sounds like what you are after is a higher abstraction like Firebase (or one of the other fire* services) – Elad Amit Mar 20 '19 at 11:56
  • Cloud Pub/Sub is designed to manage the set of acked messages for you, so subscriptions have state, e.g., the set of messages that you have acked. Kafka has this, too. You can configure the offset.retention.minutes property for how long Kafka retains the offset for your consumer. If you're not having Kafka retain your offset and making each consumer maintain this itself, then I suppose you can have a notion of an "ephemeral subscriber." But it is true that this just isn't what Cloud Pub/Sub is designed to do. The fire* services as @EladAmit mentions are a better match for this use case. – Kamal Aboul-Hosn Mar 20 '19 at 15:33
  • Thanks for the input. The issue with the fire* services is that Firestore has a 10,000 writes/s limitation right now, which might not be well suited for a high volume of messages. – Alexandre Thenorio Mar 27 '19 at 22:27

2 Answers


If you essentially want ephemeral subscriptions, then there are a few things you can set on the Subscription object when you create a subscription:

  1. Set the expiration_policy to a smaller duration. When a subscriber is not receiving messages for that time period, the subscription will be deleted. The tradeoff is that if your subscriber is down due to a transient issue that lasts longer than this period, then the subscription will be deleted. By default, the expiration is 31 days. You can set this as low as 1 day. For pull subscribers, the subscribers simply need to stop issuing requests to Cloud Pub/Sub for the timer on their expiration to start. For push subscriptions, the timer starts based on when no messages are successfully delivered to the endpoint. Therefore, if no messages are published or if the endpoint is returning an error for all pushed messages, the timer is in effect.

  2. Reduce the value of message_retention_duration. This is the time period for which messages are kept in the event a subscriber is not receiving messages and acking them. By default, this is 7 days. You can set it as low as 10 minutes. The tradeoff is that if your subscriber disconnects or gets behind in processing messages by more than this duration, messages older than that will be deleted and the subscriber will not see them.
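
A minimal sketch of setting both of these with the Go client, assuming a version of the client library that exposes both fields; the project, topic, and subscription names here are placeholders:

```go
package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()

	// "my-project", "events", and "ephemeral-sub-1" are placeholder names.
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Create a subscription that expires quickly when idle and keeps
	// unacked messages for only a short window.
	_, err = client.CreateSubscription(ctx, "ephemeral-sub-1", pubsub.SubscriptionConfig{
		Topic:             client.Topic("events"),
		ExpirationPolicy:  24 * time.Hour,   // 1 day is the minimum allowed expiration
		RetentionDuration: 10 * time.Minute, // 10 minutes is the minimum allowed retention
	})
	if err != nil {
		log.Fatal(err)
	}
}
```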

Subscribers that cleanly shut down could probably just call DeleteSubscription themselves so that the subscription goes away immediately, but for ones that shut down unexpectedly, setting these two properties will minimize the time for which the subscription continues to exist and the number of messages (that will never get delivered) that will be retained.
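
On a clean shutdown, the same instance can delete its own subscription so it goes away immediately; a sketch, reusing the placeholder names from above:

```go
// Called from the instance's shutdown path; in real code you may want
// to ignore "not found" if another path already deleted it.
sub := client.Subscription("ephemeral-sub-1")
if err := sub.Delete(ctx); err != nil {
	log.Printf("could not delete subscription: %v", err)
}
```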

Keep in mind that Cloud Pub/Sub quotas limit you to 10,000 subscriptions per topic and per project. Therefore, if many subscriptions are created and left either active or not cleaned up (manually, or automatically after the expiration_policy's TTL has passed), it may become impossible to create new subscriptions.

Kamal Aboul-Hosn
  • Thank you for the tips, and that is exactly what I am doing right now (except setting the expiration, because the Go API does not support that yet). It just feels like a workaround due to the way Pub/Sub works rather than the proper way of solving this type of problem. I guess there is just no other way of solving this with GCP's currently offered services? – Alexandre Thenorio Mar 19 '19 at 14:53

I think your original idea was better than ephemeral subscriptions, tbh. I mean, it works, but it feels totally unnatural. It depends on what your requirements are: for example, do clients only need to receive messages while they're connected, or do they all need to get all messages?

Only While Connected

Your original idea was better, imo. What I probably would have done is create a gRPC streaming service that clients could connect to. The implementation is essentially an observer pattern: the consumer receives a message and then iterates through the subscribers, doing a "Send" to each of them. From there, any time a client connects to the service, it just registers itself with that observer collection and unregisters when it disconnects. Horizontal scaling is passive, since clients are sticky to whatever instance they've connected to.
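
A rough sketch of that observer collection in Go; the Event type, buffer size, and drop-on-slow-client policy are my own placeholder choices, not anything prescribed by gRPC:

```go
package stream

import "sync"

// Event stands in for the generated protobuf type that the gRPC
// stream would actually Send.
type Event struct{ Payload []byte }

// Broadcaster implements the observer pattern: each connected gRPC
// stream registers a channel, and the Pub/Sub consumer fans every
// message out to all of them.
type Broadcaster struct {
	mu        sync.Mutex
	observers map[chan Event]struct{}
}

func NewBroadcaster() *Broadcaster {
	return &Broadcaster{observers: make(map[chan Event]struct{})}
}

// Register is called when a client connects; the returned channel is
// drained by that client's gRPC Send loop.
func (b *Broadcaster) Register() chan Event {
	ch := make(chan Event, 64) // small buffer so one slow client doesn't stall the rest
	b.mu.Lock()
	b.observers[ch] = struct{}{}
	b.mu.Unlock()
	return ch
}

// Unregister is called when the client disconnects.
func (b *Broadcaster) Unregister(ch chan Event) {
	b.mu.Lock()
	delete(b.observers, ch)
	b.mu.Unlock()
	close(ch)
}

// Broadcast is called by the Pub/Sub consumer for every message.
func (b *Broadcaster) Broadcast(ev Event) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for ch := range b.observers {
		select {
		case ch <- ev:
		default: // drop for clients that can't keep up rather than blocking everyone
		}
	}
}
```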

Everyone Always Gets the Message, Eventually

The concept is similar to the above, but the client doesn't implicitly unregister from the observer on disconnect. Instead, it registers and unregisters explicitly (through a method/command designed to do so). Modify the 'on disconnected' logic to tell the observer list that the client has gone offline. The consumer's broadcast logic is then slightly different: it iterates through the list and says "if online, then send, else queue", sending the message to an ephemeral queue that belongs to the client. Your 'on connect' logic then sends all messages that are in the queue to the client before informing the consumer that it's back online. Basically, an inbox.

Setting up ephemeral, self-deleting queues is really easy in most products like RabbitMQ. You'll have to do a bit of managing around whether or not it's OK to delete a queue, though. For example, never delete the queue unless the client explicitly unsubscribes or has been inactive for long enough. Fail to do that, and the whole inbox idea falls apart.
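
Here's a rough in-memory sketch of that flow in Go. The per-client inbox is just a slice standing in for a durable per-client queue (e.g. an ephemeral RabbitMQ queue), purely to illustrate the "if online, then send, else queue" control flow:

```go
package inbox

import "sync"

// Event stands in for whatever message type the consumer broadcasts.
type Event struct{ Payload []byte }

// clientState tracks whether a client is connected and, while it is
// offline, queues its undelivered messages (its "inbox").
type clientState struct {
	online bool
	ch     chan Event // live delivery channel while connected
	inbox  []Event    // queued messages while disconnected
}

// Hub broadcasts to explicitly registered clients.
type Hub struct {
	mu      sync.Mutex
	clients map[string]*clientState
}

func NewHub() *Hub { return &Hub{clients: make(map[string]*clientState)} }

// Subscribe registers a client explicitly; it keeps accumulating an
// inbox until Unsubscribe is called, even across disconnects.
func (h *Hub) Subscribe(id string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.clients[id]; !ok {
		h.clients[id] = &clientState{}
	}
}

// Unsubscribe removes the client and drops its inbox.
func (h *Hub) Unsubscribe(id string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.clients, id)
}

// Connect marks the client online, flushing its inbox first so queued
// messages arrive in order. The returned channel feeds the client's stream.
func (h *Hub) Connect(id string) chan Event {
	h.mu.Lock()
	defer h.mu.Unlock()
	c, ok := h.clients[id]
	if !ok {
		return nil // client must Subscribe first
	}
	c.ch = make(chan Event, 64+len(c.inbox))
	for _, ev := range c.inbox {
		c.ch <- ev
	}
	c.inbox = nil
	c.online = true
	return c.ch
}

// Disconnect marks the client offline but keeps its registration, so
// later broadcasts go to its inbox instead.
func (h *Hub) Disconnect(id string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if c, ok := h.clients[id]; ok && c.online {
		c.online = false
		close(c.ch)
	}
}

// Broadcast implements "if online, then send, else queue".
func (h *Hub) Broadcast(ev Event) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, c := range h.clients {
		if c.online {
			c.ch <- ev // may block on a full buffer; a real impl needs a policy here
		} else {
			c.inbox = append(c.inbox, ev)
		}
	}
}
```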

[diagram: inbox pattern]

The selected answer above is most similar to what I'm describing here, in that the subscription is the queue. If I did this, then I'd probably implement it as an internal bus instead of an observer (since that would be unnecessary): you create a consumer on demand for a connecting client that literally just forwards the messages. The message consumer subscribes and unsubscribes based on whether or not the client is connected. As Kamal noted, you'll run into problems if your scale exceeds the maximum number of subscriptions allowed by Pub/Sub. If you find yourself in that position, you can unshackle that constraint by implementing the pattern above. It's basically the same pattern, but you shift the responsibility over to your infra, where the only constraint is your own resources.
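
A sketch of that on-demand consumer using the Go Pub/Sub client; the subscription naming scheme, the topic name, and the send callback (standing in for the gRPC stream's Send) are all placeholders:

```go
package main

import (
	"context"
	"time"

	"cloud.google.com/go/pubsub"
)

// handleClient creates a dedicated subscription when a client connects,
// forwards every message to it, and tears the subscription down on
// disconnect.
func handleClient(ctx context.Context, client *pubsub.Client, clientID string, send func([]byte) error) error {
	subID := "client-" + clientID // hypothetical naming scheme
	sub, err := client.CreateSubscription(ctx, subID, pubsub.SubscriptionConfig{
		Topic:             client.Topic("events"), // placeholder topic name
		ExpirationPolicy:  24 * time.Hour,         // safety net if we crash before Delete runs
		RetentionDuration: 10 * time.Minute,
	})
	if err != nil {
		return err
	}
	// Clean up even if ctx is already cancelled by the disconnect.
	defer sub.Delete(context.Background())

	// Receive blocks until ctx is cancelled, i.e. the client disconnects.
	return sub.Receive(ctx, func(_ context.Context, m *pubsub.Message) {
		if err := send(m.Data); err != nil {
			m.Nack()
			return
		}
		m.Ack()
	})
}
```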

[diagram: internal bus pattern]

gRPC makes this mechanism pretty easy. Alternatively, for web, if you're on a Microsoft stack, SignalR makes this easy too: clients connect to the hub, and you can publish to all connected clients. The consumer pattern here remains mostly the same, but you don't have to implement the observer pattern by hand.

(note: arrows in diagram are in the direction of dependency, not data flow)

Sinaesthetic