
I have an Azure Function (v2) that monitors a blob container and triggers on new blobs. The function was working fine until it stopped unexpectedly. We have since traced the issue to the storage logs no longer being written (see question on MS Forums).

As I understand it, an Azure Function monitors blobs directly until there are more than 10k blobs in a container (see this document), after which it switches to scanning the storage logs. This was the case with my function - I had over 10k blobs, so the logs were being monitored. I have since deleted the majority of my blobs, leaving only a few hundred in each container, including those located in the $logs container (a couple thousand among all containers). My function still does not fire on new blobs, indicating that the logs are still being monitored (and the logs are not working correctly).

My question is, how does the Function runtime decide to poll blobs directly or use logs? And how do I get the runtime to stop monitoring log files?

brudert
  • I'd recommend using an Azure Event Grid pub/sub model for your solution. It's a push eventing model, whereas the BlobTrigger function is a pull/poll model. – Roman Kiss Dec 01 '18 at 08:27
  • Thanks. Yeah, Event Grid has been recommended numerous times. I'm starting to wonder if a blob trigger is useful for anything more than a POC? – brudert Dec 03 '18 at 18:14

1 Answer


My experience is that, at high volumes, blob triggers can be hit or miss - they may not catch every event. Queue triggers are very reliable; we're using them to handle 50K blobs a day. If it is business critical, I'd recommend a queue + blob architecture. The MS article you linked to now nudges folks in the same direction:

In addition, storage logs are created on a "best effort" basis. There's no guarantee that all events are captured. Under some conditions, logs may be missed.

If you require faster or more reliable blob processing, consider creating a queue message when you create the blob. Then use a queue trigger instead of a blob trigger to process the blob. Another option is to use Event Grid; see the tutorial Automate resizing uploaded images using Event Grid.
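In the Functions v2 programming model, switching to a queue trigger is mostly a matter of changing the binding. A minimal `function.json` for a queue-triggered function might look like the following (the queue name `blob-claims` is an assumption for illustration; `AzureWebJobsStorage` is the default storage connection setting):

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "msg",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "blob-claims",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```

The function body then receives the queue message (e.g. the blob URL) as `msg` and fetches the blob itself, rather than relying on the runtime to detect the blob.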

I've not used Event Grid, but for our needs it sounded like overkill. For a queue + blob architecture we follow the claim-check pattern. In a nutshell: whatever triggers a new blob to be written should also write a message to the queue. The message can simply be the blob URL. Then use a queue-triggered function to monitor the queue, retrieve each message with its blob URL, and act on the blob. The queue trigger won't miss an event - every blob will be handled.
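The claim-check flow above can be sketched in plain Python. This is only an in-memory stand-in for illustration - `blob_store` and `work_queue` are hypothetical substitutes for a real blob container and Storage Queue, not the Azure SDK:

```python
import queue

# Hypothetical in-memory stand-ins; in Azure these would be a blob
# container and an Azure Storage Queue.
blob_store = {}           # blob URL -> blob content
work_queue = queue.Queue()

def write_blob(url: str, data: bytes) -> None:
    """Producer side of the claim-check pattern: store the blob, then
    enqueue a message carrying only the blob URL (the 'claim check')."""
    blob_store[url] = data
    work_queue.put(url)

def process_next() -> tuple[str, bytes]:
    """Queue-triggered side: dequeue a claim check, look up the blob,
    and hand it to whatever processing you need."""
    url = work_queue.get()
    data = blob_store[url]
    return url, data

# Whatever writes the blob also writes the queue message.
write_blob("https://myaccount.blob.core.windows.net/uploads/report.csv",
           b"a,b\n1,2\n")
url, data = process_next()
```

Because every `write_blob` call enqueues exactly one message, nothing depends on polling or log scanning: each blob is guaranteed a corresponding trigger.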

Troy Witthoeft