
I have a blob storage container on which I have configured an Event Grid trigger (Blob Created). I load files into this container through Data Factory, and it often happens that many files arrive in one shot. Let's take 20 files as an example.

The good news is that my Event Grid trigger fires and the function app is called. However, I can see that sometimes the Event Grid trigger fires more than once for the same file.

Out of these 20 files, a few are very large, say 300 MB, while the others are pretty small, around 3 KB. So my suspicion is that while the 300 MB file has been triggered and is still processing, the same 300 MB file is picked up again in parallel (since Event Grid assumes it has still not been read) and is saved in the DB multiple times, which is not what I want.

Would Azure Event Grid be the right approach for this scenario?

amit agarwal
  • It looks like your subscriber, such as an ADF pipeline, sometimes takes more than 30 seconds; see more details at https://learn.microsoft.com/en-us/azure/event-grid/delivery-and-retry#retry-schedule-and-duration – Roman Kiss Nov 23 '20 at 16:17
  • No. All my files are copied through ADF within around 40 seconds, so I believe ADF is not the problem. – amit agarwal Nov 24 '20 at 11:40
  • Try reading that link again. It describes what happens during delivery if AEG does not receive a response from the subscriber (ADF) within 30 seconds: when ADF sends the response later (such as after ~40 seconds), AEG has already started retrying delivery according to the retry schedule. More details can be found in that link. Also, have a look at the AEG metrics for delivery time, etc. – Roman Kiss Nov 24 '20 at 12:01
  • Yeah, you are right, it takes more than 30s, which is why it is retrying. Is it possible to change this response timeout to, say, 10 minutes, or to make it never retry if no response is received? – amit agarwal Nov 25 '20 at 05:48
  • @RomanKiss, how is this 30-second period counted? Does AEG simply wait for the event delivery ACK, or does it include the event trigger function's processing time as well? One more thing: we can't send an HTTP response from an Azure event trigger function, so is the acknowledgement from the event trigger function handled implicitly? In my scenario, the event trigger function is invoked within 3 seconds (T+3 sec) after the event is published (T), but the function itself takes around 2 to 3 minutes to complete processing. Still, I see the retry happen at T+3 minutes. – Pintu Jun 12 '21 at 11:11
  • @Pintu, I recommend reading https://learn.microsoft.com/en-us/azure/event-grid/delivery-and-retry#retry-schedule-and-duration, where this process is described in detail. – Roman Kiss Jun 12 '21 at 11:33
  • "After 30 seconds, if the endpoint hasn’t responded, the message is queued for retry", but an event trigger function can't return an HTTP response. Am I missing something? I tried to update function.json to return an HTTP response, and it gives an error. – Pintu Jun 12 '21 at 14:02

2 Answers


Once Event Grid triggers the Azure Function for an event, it expects a reply from the function within the next 2 minutes. If there is no response, Event Grid will retry the delivery.

The default maximum number of delivery attempts for an Event Grid subscription is 30. Change it to 1.

Now, even for large files that take more than 2 minutes to process, a second, duplicate trigger will not happen.
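
Since Event Grid delivery is at-least-once, it may also be worth making the handler itself idempotent, so that a repeated delivery for the same blob does not write to the DB twice. Below is a minimal sketch, not the asker's actual code: it assumes the event follows the Event Grid schema for Blob Created (`data.url`, `data.eTag`), and the `processed_blobs` table, `init_store`, `process_once` and the SQLite store are all hypothetical stand-ins for whatever database the function really uses.

```python
import sqlite3


def init_store(conn):
    # Hypothetical bookkeeping table: one row per blob version already claimed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_blobs (blob_key TEXT PRIMARY KEY)"
    )
    conn.commit()


def process_once(conn, event, handler):
    """Run handler(event) only if this blob version has not been claimed yet.

    Returns True if the handler ran, False if the event was a duplicate.
    """
    data = event["data"]
    # URL + eTag identifies one version of one blob, regardless of event id.
    key = f'{data["url"]}|{data["eTag"]}'
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_blobs (blob_key) VALUES (?)", (key,)
            )
    except sqlite3.IntegrityError:
        # The primary key already exists: another delivery claimed this blob.
        return False
    try:
        handler(event)
    except Exception:
        # Release the claim so that a later retry can process the blob.
        with conn:
            conn.execute(
                "DELETE FROM processed_blobs WHERE blob_key = ?", (key,)
            )
        raise
    return True
```

The claim is inserted before the long-running work starts, so a second delivery that arrives while the 300 MB file is still being processed is rejected by the primary key instead of being processed in parallel.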


I can see that you opened a new thread for handling your problem, which is also based on the Push-Push pattern for event processing.

I recommend using the Push-Pull pattern, where AEG delivers each event to a storage queue, and the events can then be processed as needed by an Azure Queue Trigger function in a concurrent manner; see more details here.
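
To illustrate the pull side of this pattern, here is a rough sketch under stated assumptions: the queued message body is taken to be the JSON text of a single event in the Event Grid schema (`id`, `eventType`, `data.url`, `data.contentLength`), and an in-memory `queue.Queue` stands in for the storage queue. `BlobCreated`, `parse_blob_created` and `drain` are hypothetical names for illustration, not part of any Azure SDK.

```python
import json
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobCreated:
    event_id: str
    url: str
    size_bytes: int


def parse_blob_created(body: str):
    """Parse one queued event; return None for event types we do not handle."""
    event = json.loads(body)
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None
    data = event["data"]
    return BlobCreated(event["id"], data["url"], int(data.get("contentLength", 0)))


def drain(pending, process) -> int:
    """Pull messages one at a time and process each; return the number handled.

    `pending` is a queue.Queue of message bodies (a stand-in for the storage
    queue); `process` is the caller's function that loads one blob into the DB.
    """
    handled = 0
    while True:
        try:
            body = pending.get_nowait()
        except queue.Empty:
            return handled
        blob = parse_blob_created(body)
        if blob is not None:
            process(blob)
            handled += 1
```

The point of the design is that the consumer pulls work at its own pace, so a slow 300 MB file no longer has to finish inside Event Grid's response window.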

Notes:

  1. The default timeout of an Azure Function on the Consumption plan is 5 minutes, with a maximum of 10 minutes. If this time is not enough for processing a large blob, you should use a Premium plan, where the default timeout is 30 minutes with a maximum of 60 minutes.

  2. If you want to keep your business logic in the ADF pipeline, have a look here at how easily the pipeline can be invoked from the Azure Function.

Roman Kiss