
The documentation on Pub/Sub pricing is very minimal. Can someone explain the costs for the scenario below?

  • Size of the data per event = 0.5 KB
  • Size of data per day = 1 TB

There is only one publisher app and there are two dataflow pipeline subscriptions.

The very rough estimate I can come up with is:

  • 1x publishing
  • 2x subscription (1x for each subscription)
  • 2x acknowledgment (1x for each subscription ack)

The questions are:

  1. Is the total data volume per month 150 TB (30 * 1 TB * 5x)? That is $8,000 per month from the price calculator (the arithmetic is sketched after this list).
  2. Is the 1 KB minimum size for the calculation applicable even when acknowledging a message?
  3. Dataflow handles subscribe/acknowledge in bundles of ParDos. But is each message in a bundle acknowledged separately?
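
The rough arithmetic behind question 1, sketched in Python; the 5x multiplier is the question's own assumption:

```python
# Rough volume estimate from question 1: 1 TB/day of events, multiplied by
# 5x (1x publish + 2x subscribe + 2x acknowledge, as assumed above).
tb_per_day = 1
days_per_month = 30
traffic_multiplier = 5  # the question's assumption; see the answer below

monthly_volume_tb = tb_per_day * days_per_month * traffic_multiplier
print(f"{monthly_volume_tb} TB/month")  # 150 TB/month, the figure fed to the price calculator
```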
mmziyad

1 Answer


One does not pay for acknowledgements in Google Cloud Pub/Sub, only for publishes, pulls, and pushes. With messages of size 0.5 KB, the amount you'd get charged would depend on the batching because of the 1 KB minimum size. If all requests had at least 1 KB, then the total cost for publishing and getting messages to two subscribers would be:

1 TB/day * 30 days * 3 = 92,160 GB/month

10 GB * $0 + 92,150 GB * $0.04 = $3,686
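
For reference, the same arithmetic as a small Python sketch; the ~$0.04/GB rate and the 10 GB free tier are the figures used above, and the 1 TB/day comes from the question:

```python
# Reproducing the calculation above: 1 TB/day published, delivered to two
# subscribers (3x total traffic), priced at ~$0.04/GB with the first 10 GB free.
GB_PER_TB = 1024
price_per_gb = 0.04
free_gb = 10

monthly_gb = 1 * GB_PER_TB * 30 * 3         # 92,160 GB/month
billable_gb = max(monthly_gb - free_gb, 0)  # 92,150 GB
monthly_cost = billable_gb * price_per_gb   # $3,686.00
print(f"{monthly_gb:,} GB/month -> ${monthly_cost:,.2f}")
```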

If some messages were not batched, then the price could go up because of the 1 KB minimum. The Google Cloud Pub/Sub client library batches published messages by default, so as long as your messages are not published too sporadically (that is, too infrequently for batching to kick in), each publish request should reach the 1 KB minimum. With this amount of data, you will probably end up with batching on the subscribe side as well.
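
As a rough illustration of the publisher-side batching described above, here is a minimal sketch using the Python client library; the project and topic names and the specific thresholds are placeholders, and the client already batches with sensible defaults:

```python
from google.cloud import pubsub_v1

# Publisher-side batching (illustrative thresholds; the client batches by default).
# The goal is for each publish request to comfortably exceed the 1 KB billing minimum.
batch_settings = pubsub_v1.types.BatchSettings(
    max_messages=100,      # flush after 100 messages...
    max_bytes=100 * 1024,  # ...or ~100 KB of buffered data...
    max_latency=0.05,      # ...or 50 ms, whichever comes first
)

publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholder names

# 0.5 KB events, as in the question; each publish returns a future that
# resolves once its batch has been sent.
for _ in range(1000):
    publisher.publish(topic_path, data=b"x" * 512)
```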

Kamal Aboul-Hosn
  • Thanks for the clarification on acknowledgements. I was confusing them with [quota limits](https://cloud.google.com/pubsub/quotas), where acknowledgements are also counted. Using the Pub/Sub client API we can batch requests when publishing. How can we batch the reads in Cloud Dataflow subscriber pipelines? – mmziyad Jan 14 '18 at 10:22
  • Reads from Cloud Dataflow will likely be inherently batched. Unless the `max_messages` property is set to 1, pull requests made to Pub/Sub send multiple messages as a batch to the subscriber when they are available (sketched after these comments). – Kamal Aboul-Hosn Jan 15 '18 at 16:40
  • @KamalAboul-Hosn can you explain how you got the numbers 0.06 and 0.05 in the cost analysis? – Chinmay Aug 15 '20 at 04:10
  • @Chinmay This was based on the old pricing model which had multiple tiers. I have updated the answer to reflect the more recent pricing model that charges $40/TiB. – Kamal Aboul-Hosn Aug 17 '20 at 11:14
  • @KamalAboul-Hosn Yeah, I was thinking along the same lines, that those numbers were from the old pricing; thank you for the update. I have one more question: do we need to consider network cost? E.g., assume there is a topic in Pub/Sub with a single subscription, and that subscription is pulled by jobs that write to Kafka topics, so messages are sent from Pub/Sub to Kafka. Is there any cost involved here, e.g. network cost? – Chinmay Aug 17 '20 at 22:53
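
A minimal sketch of the pull-side batching mentioned in the comments, using the Python client's synchronous pull; the project and subscription names are placeholders:

```python
from google.cloud import pubsub_v1

# Synchronous pull: up to max_messages are returned in a single response,
# so the billable request size is the whole batch rather than one 0.5 KB
# message. Project and subscription names are placeholders.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

response = subscriber.pull(
    request={"subscription": subscription_path, "max_messages": 100}
)

ack_ids = [msg.ack_id for msg in response.received_messages]
if ack_ids:
    # Acknowledgements themselves are not billed, per the answer above.
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids}
    )
```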