Here is my flow for my imports:

  • When a new file is detected in Blob Storage, an event is triggered into Event Grid
  • Event Grid retries until it is able to call the Azure Function
  • The Azure Function injects the event into the Service Bus queue
  • A webapp will consume the Queue

So I guess that this process is very resilient because each message is either stored or retried. The only step that could fail is the connection between the storage and Event Grid. What if that connection is down when a file is created on the storage? How can I be sure that the event will still be triggered?

abatishchev
Kapoue
  • Why not have the event grid create the queue item directly? – NotFound Dec 13 '21 at 14:54
  • @404 you still have the same issue that the event grid might be unable to react to the blob storage changes. – Peter Bons Dec 13 '21 at 14:58
  • @PeterBons Well, true, although it will be very unlikely if you set up a proper retry policy in your event subscription. It also severely limits the points of failure from the OP's post. – NotFound Dec 13 '21 at 15:01
  • @404 it seems to me the OP questions the reliability of the storage account being able to send events, which is outside the subscriptions control. It is not the consuming part OP is worried about since indeed there are retry policies for that. – Peter Bons Dec 13 '21 at 15:03
  • according to the doc "Storage events guarantees at-least-once delivery to subscribers, which ensures that all messages are outputted" so Microsoft provide the resilience. Your other resilience techniques would deal with errors after storage has delivered to Event Grid. https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-event-overview – codebrane Dec 13 '21 at 15:10

1 Answer

... So I guess that this process is very resilient because each message is either stored or retried. The only step that could fail is the connection between the storage and Event Grid. What if that connection is down when a file is created on the storage? How can I be sure that the event will still be triggered?

You can't. It is an at-least-once delivery system, though, so it has internal mechanisms in place. Blob Storage is an Azure service that supports system topics, so that part is outside your control. An outage could still impact the service, and there is no service level agreement for server-side disaster recovery (which is what we are dealing with in such a scenario):

There is no service level agreement (SLA) for server-side disaster recovery. If the paired region has no extra capacity to take on the additional traffic, Event Grid cannot initiate failover. Service level objectives are best-effort only.

You can find the recovery objectives in the documentation:

Recovery point objective (RPO)

  • Metadata RPO: zero minutes. Anytime a resource is created in Event Grid, it's instantly replicated across regions. When a failover occurs, no metadata is lost.
  • Data RPO: If your system is healthy and caught up on existing traffic at the time of regional failover, the RPO for events is about 5 minutes.

Recovery time objective (RTO)

  • Metadata RTO: Though generally it happens much more quickly, within 60 minutes, Event Grid will begin to accept create/update/delete calls for topics and subscriptions.
  • Data RTO: Like metadata, it generally happens much more quickly, however within 60 minutes, Event Grid will begin accepting new traffic after a regional failover.

So data loss is extremely unlikely, but there is no 100% guarantee. Should you be worried now? Probably not, given the very low chances.
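If even that small residual risk matters for your imports, one possible mitigation (not part of the flow in the question, and only a minimal sketch) is a periodic reconciliation job: list the blobs in the container, compare them with the ones you have already handled, and re-handle whatever is missing. Because delivery is at-least-once, the handler also has to tolerate the same blob being announced twice. The names `ImportTracker`, `find_missed_blobs` and `reconcile` below are hypothetical, the in-memory set stands in for a durable store, and the blob listing is assumed to come from the storage SDK in a real setup.

```python
from dataclasses import dataclass, field


@dataclass
class ImportTracker:
    """In-memory stand-in (assumption) for a durable store of handled blob names."""
    processed: set = field(default_factory=set)

    def handle(self, blob_name: str) -> bool:
        """Handle a blob once; return False if it was already handled.

        At-least-once delivery means the same blob can be announced more
        than once, so a repeated name is ignored instead of re-imported.
        """
        if blob_name in self.processed:
            return False
        self.processed.add(blob_name)
        return True


def find_missed_blobs(listed_blobs, tracker):
    """Return, sorted, the blobs present in storage that were never handled."""
    return sorted(set(listed_blobs) - tracker.processed)


def reconcile(listed_blobs, tracker):
    """Handle every missed blob and return the names that were recovered."""
    missed = find_missed_blobs(listed_blobs, tracker)
    for name in missed:
        tracker.handle(name)
    return missed
```

Run on a schedule, `reconcile` would pick up any file whose event never arrived, while the duplicate check in `handle` keeps a late or repeated event from importing the same file twice.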

Peter Bons