Why is Pub/Sub used? Use case: there is an HTTP-triggered Cloud Function that receives some data. After processing the data, this function publishes it to a Pub/Sub topic.

Then there is another Cloud Function, which is triggered by publishes to that Pub/Sub topic. This function takes the published data from Pub/Sub and inserts it into a BigQuery table.

So in this use case, why is Pub/Sub used? Why can't we just have one Cloud Function that takes the data from the HTTP request and inserts it into BigQuery? What could be the design reasoning behind choosing Pub/Sub here?

Also, in general, why is a Pub/Sub architecture used?

Nandan Raj

1 Answer

There could be several reasons to use Cloud Pub/Sub in this architecture. One reason would be fan-out, either existing or planned, where the published data ends up not only in BigQuery but also somewhere else. Without Pub/Sub, the HTTP-triggered Cloud Function would have to know about every interested receiver of the data and send it to each of them. With Pub/Sub, any additional service that is interested in the incoming data can create its own subscription and consume the data independently of the BigQuery path.
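To make the fan-out point concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Cloud Pub/Sub client library: `InMemoryTopic` and the subscription names `bigquery-writer` and `audit-log` are hypothetical, chosen only for illustration. It shows that adding a second consumer requires no change to the publishing code.

```python
from collections import deque


class InMemoryTopic:
    """Toy stand-in for a Pub/Sub topic (hypothetical, not the real client)."""

    def __init__(self):
        # One independent queue per subscription name.
        self._queues = {}

    def subscribe(self, name):
        # Creating a subscription needs no change on the publisher side.
        self._queues.setdefault(name, deque())

    def publish(self, message):
        # Every existing subscription receives its own copy of the message.
        for queue in self._queues.values():
            queue.append(message)

    def pull(self, name):
        # Drain and return everything waiting on one subscription.
        queue = self._queues[name]
        messages = list(queue)
        queue.clear()
        return messages


topic = InMemoryTopic()
topic.subscribe("bigquery-writer")
topic.subscribe("audit-log")  # added later; the publisher is unchanged

# The HTTP-triggered function only ever does this one call.
topic.publish({"user": "a", "value": 1})

bigquery_rows = topic.pull("bigquery-writer")
audit_rows = topic.pull("audit-log")
```

Each subscription drains its own queue, so one consumer pulling a message does not remove it for the other.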

Another reason would be to batch inserts into BigQuery without increasing latency on the HTTP requests to the initial Cloud Function. Batching may be done for efficiency or to allow some kind of cross-event pre-processing of the incoming data. With Pub/Sub, the first Cloud Function can respond to the request as soon as the publish to Cloud Pub/Sub succeeds, without having to wait for any other requests, and can be sure that the data will ultimately be processed.
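The decoupling described here can be sketched roughly as follows. This is an illustrative Python model only: `handle_http`, `BatchingWriter`, `fake_publish`, and the batch size are hypothetical names and values, the topic and BigQuery are replaced by plain lists, and it assumes a long-lived consumer that can hold a buffer between messages. Acknowledgement and retry handling are left out.

```python
import json


def handle_http(body, publish):
    """HTTP side: parse, publish, and return without touching BigQuery."""
    record = json.loads(body)
    message_id = publish(json.dumps(record).encode("utf-8"))
    # The caller gets a response as soon as the publish has succeeded.
    return {"status": 202, "message_id": message_id}


class BatchingWriter:
    """Consumer side: buffer messages and insert them in batches."""

    def __init__(self, insert_rows, batch_size):
        self._insert_rows = insert_rows  # callable taking a list of rows
        self._batch_size = batch_size
        self._buffer = []

    def on_message(self, data):
        self._buffer.append(json.loads(data.decode("utf-8")))
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        # Write whatever is buffered as a single batch, then start fresh.
        if self._buffer:
            self._insert_rows(self._buffer)
            self._buffer = []


queue = []     # stands in for the Pub/Sub topic
inserted = []  # stands in for BigQuery; each element is one batch


def fake_publish(data):
    queue.append(data)
    return str(len(queue))  # pretend message ID


responses = [handle_http(json.dumps({"n": i}), fake_publish) for i in range(5)]

writer = BatchingWriter(inserted.append, batch_size=2)
for data in queue:
    writer.on_message(data)
writer.flush()  # write the final partial batch
```

The five HTTP calls each return immediately, while the consumer performs only three inserts, so the cost of the BigQuery writes never shows up in request latency.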

If neither of these applies, now or in the foreseeable future, it may make sense to write to BigQuery directly.

Kamal Aboul-Hosn