
I'm trying to process files that are dropped into GCP buckets in the order they arrive. However, I can't find a way to set an orderingKey on the StorageNotification.

https://cloud.google.com/pubsub/docs/publisher#using-ordering-keys

Looking at this page, there's this line:

You can publish messages with ordering keys using the Google Cloud console, the Google Cloud CLI, or the Pub/Sub API.

Does that mean it is NOT possible to do so via a StorageNotification? If so, what options do I have for enabling an ordering key on notifications from GCS buckets?

Any clarifications/thoughts?

Thank you

Jty.tan
  • No, it's not possible to define an ordering key on Cloud Storage events, and there is no easy solution to this. Since Cloud Storage is a distributed system, what does "order" even mean? – guillaume blaquiere Aug 03 '23 at 07:46

1 Answer


No, Pub/Sub notifications for Cloud Storage do not support ordering keys. For ordering keys to even work in the case you describe, both of the following would need to hold:

  1. All messages use the same ordering key. For a bucket with many files, this may not be practical.
  2. All messages would have to be published in the same region, since the Pub/Sub guarantees around ordered delivery with ordering keys require the messages to be published in a single region.

If you want to process messages in order, you could consider using Dataflow and its ability to ensure ordering based on a timestamp attribute in the Pub/Sub messages. See "Stream messages from Pub/Sub by using Dataflow":

If you would like to window Pub/Sub messages by a custom timestamp, you can specify the timestamp as an attribute in the Pub/Sub message, and then use the custom timestamp with PubsubIO's withTimestampAttribute

Cloud Storage notifications include eventTime as an attribute on all published messages.
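If a full Dataflow pipeline is more than you need, the same idea can be approximated by hand: buffer a batch of notifications and sort them by the eventTime attribute before processing. Below is a minimal sketch in plain Python (not Beam, and not using the Pub/Sub client). The function names and the shape of the `notifications` dicts are hypothetical, and it assumes eventTime is an RFC 3339 string in UTC with a `Z` suffix.

```python
import re
from datetime import datetime, timezone

# Assumption: eventTime looks like "2023-08-03T07:46:01.25Z" (RFC 3339, UTC,
# optional fractional seconds). Other offsets are rejected by this sketch.
_RFC3339_UTC = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$"
)


def parse_event_time(value):
    """Parse an RFC 3339 UTC timestamp into a timezone-aware datetime."""
    match = _RFC3339_UTC.match(value)
    if match is None:
        raise ValueError(f"unsupported eventTime format: {value!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Pad or truncate the fractional part to microsecond precision.
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base.replace(microsecond=int(fraction), tzinfo=timezone.utc)


def sort_by_event_time(notifications):
    """Return the notifications ordered by their eventTime attribute.

    `notifications` is a hypothetical list of dicts holding the message
    attributes (for example eventTime and objectId).
    """
    return sorted(notifications, key=lambda n: parse_event_time(n["eventTime"]))
```

Note that this only orders the messages you have already buffered. A notification that arrives after its batch has been processed can still be out of order, which is the problem Dataflow's windowing and watermarks are designed to handle.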

Kamal Aboul-Hosn
  • Thank you for the clarification. One follow-up: if an `orderingKey` is set, does that mean that Pub/Sub will NOT trigger subsequent notifications for that `orderingKey` until the previous one has been ack-ed or nack-ed? *Will* it actually trigger the next one if the previous one was nack-ed? Thank you – Jty.tan Aug 07 '23 at 23:47
  • I'm not sure what you mean by "trigger subsequent notification." That is not a phrase that is meaningful in Pub/Sub. – Kamal Aboul-Hosn Aug 09 '23 at 10:19