
I have an Azure Function triggered by Event Grid events, which are created only when a blob is uploaded to a Storage Account.

This is now deployed and working well, although for some reason the Function keeps getting triggered by the same event even though it was already processed successfully.

Example:

  • Yesterday, 8 successful tests; all good.

  • Today I review the logs, and the function has continued to execute!

  • Error: "Blob does not exist"

  • I deleted the blob after the last test yesterday. Why is Event Grid still firing?

Code snippet:

import datetime
import io
import json
import logging

import azure.functions as func
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobClient


def main(event: func.EventGridEvent):
    result = json.dumps({
        'id': event.id,
        'data': event.get_json(),
        'topic': event.topic,
        'subject': event.subject,
        'event_type': event.event_type
    })

    logging.info('EventGrid trigger processing an event: %s', result)

    credential = DefaultAzureCredential()

    download_start_time = datetime.datetime.now()
    logging.info('######## Starting blob download from storage at %s ########', download_start_time)

    # =============================
    # Download blob from storage container:
    # =============================

    blob_client = BlobClient.from_blob_url(event.get_json()["url"], credential)
    blob_data = blob_client.download_blob().readall()
    blob_byte_stream = io.BytesIO(blob_data)
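For anyone hitting a similar "Blob does not exist" on re-delivery: Event Grid provides at-least-once delivery, so a handler should tolerate duplicate deliveries of the same event id. A minimal, hypothetical in-memory sketch of such a guard (a real Function app would persist seen ids in durable storage, since instances are recycled and may scale out):

```python
import logging

# Hypothetical dedup sketch: Event Grid guarantees at-least-once delivery,
# so the same event id can arrive more than once and the handler should be
# idempotent. An in-memory set only survives as long as this instance does.
_seen_event_ids = set()

def is_duplicate(event_id: str) -> bool:
    """Return True if this event id was already handled by this instance."""
    if event_id in _seen_event_ids:
        logging.info("Skipping duplicate Event Grid delivery: %s", event_id)
        return True
    _seen_event_ids.add(event_id)
    return False
```

The `main` function above could then return early when `is_duplicate(event.id)` is true instead of attempting to download a blob that may already be deleted.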

EDIT 1: This is still happening, but this time it's a bit different.

  • Now, EventGrid keeps firing after SUCCESSFULLY delivering messages and the Function keeps running

How do I debug this?


  • It seems something is wrong with your stack. Are you using a virtual environment? And can you post some details, like your testing code? – Doris Lv Oct 08 '20 at 09:27
  • I'm using a `.venv` in my local dev env, but this Function is hosted and being triggered in Azure, not locally. I do not have any testing code to share. The Function has now ceased to trigger, but I'd like to understand how to debug this issue in the future. – ericOnline Oct 08 '20 at 18:13
  • Okay... Since we can't see your code, I can't confirm the reason. But I saw there is a _download.py file; are you trying to download a blob that was already deleted? Maybe you should check that. @ericOnline – Doris Lv Oct 09 '20 at 02:12
  • Does that help @DorisLv? This is the only part of the Function that downloads the blob from storage. After the Function successfully triggers via EventGrid message, why would it continue to trigger? To me, the **trigger** is the issue, not the `BlobClient`. – ericOnline Oct 09 '20 at 03:04
  • I faced the same problems sometimes, when the functions doesn't succeed immediately. Event Grid provides at least once delivery guarantees so subscribers of the events should account for duplicates, see [Event Grid message delivery and retry](https://learn.microsoft.com/azure/event-grid/delivery-and-retry). – sschmeck Oct 09 '20 at 06:59
  • Yeah, in this case, it seems to be a single test.csv file uploaded to a certain Container. Keeps retrying. Seems to be pretty random. Is there a cache or something I can clear out for the EventGrid trigger? – ericOnline Oct 09 '20 at 16:04
  • 1
    for test purposes: set the *maxDeliveryAttempts = 3* and *dead-lettering*, which give you more details about the failed delivering, https://learn.microsoft.com/en-us/azure/event-grid/delivery-and-retry#retry-schedule-and-duration – Roman Kiss Oct 11 '20 at 06:15
  • 1
    This is still happening. Its quite annoying (especially because we are charged per Function call). Why does EventGrid keep firing after successfully delivering messages? – ericOnline Nov 03 '20 at 18:58
  • I finally figured out what the issue was...using the `BlobClient.from_blob_url` method worked fine when testing blob uploads using Azure Storage Explorer. But when using Azure Data Factory, the `data.url` property in the EventGrid message is not the actual blob url (contained `dfs` instead of `blob`). Oddly enough, after I brought this issue to Microsoft Support attention, a new `blobUrl` property was added to EventGrid `data` object. I simply changed "url" to "blobUrl" and the method succeeded. (I also improved the error handling of the Python code to accommodate such errors in the future.) – ericOnline Dec 10 '20 at 18:01

1 Answer


I finally figured out what the issue was: the BlobClient.from_blob_url method worked fine when testing blob uploads with Azure Storage Explorer. But when uploading via Azure Data Factory, a different API is used and the data.url property in the EventGrid message is not the actual blob URL (it contained dfs instead of blob).

Oddly enough, soon after I brought this issue to the support team's attention, a new blobUrl property was added to the EventGrid data object.

In my code, I simply changed "url" to "blobUrl" and the method succeeded. (I also improved the error handling of the Python code to accommodate such errors in the future.)
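Based on that finding, a defensive way to pick the URL is to prefer blobUrl when present and fall back to rewriting the dfs host to blob. A minimal sketch (the helper name and the host-rewrite fallback are my own assumptions, not part of the original code):

```python
def get_blob_url(event_data: dict) -> str:
    """Pick a usable blob URL from the Event Grid `data` payload."""
    # Prefer the newer blobUrl property when Event Grid provides it.
    blob_url = event_data.get("blobUrl")
    if blob_url:
        return blob_url
    # Fallback (assumption): rewrite an ADLS Gen2 'dfs' endpoint to the
    # equivalent 'blob' endpoint so BlobClient.from_blob_url can use it.
    return event_data.get("url", "").replace(
        ".dfs.core.windows.net", ".blob.core.windows.net"
    )
```

In the Function, `BlobClient.from_blob_url(get_blob_url(event.get_json()), credential)` would then work for both upload paths.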

Documented EventGrid message (as of 12/10/2020):

  • No blobUrl property
[{
  "topic": "/subscriptions/{subscription-id}/resourceGroups/Storage/providers/Microsoft.Storage/storageAccounts/my-storage-account",
  "subject": "/blobServices/default/containers/my-file-system/blobs/new-file.txt",
  "eventType": "Microsoft.Storage.BlobCreated",
  "eventTime": "2017-06-26T18:41:00.9584103Z",
  "id": "831e1650-001e-001b-66ab-eeb76e069631",
  "data": {
    "api": "CreateFile",
    "clientRequestId": "6d79dbfb-0e37-4fc4-981f-442c9ca65760",
    "requestId": "831e1650-001e-001b-66ab-eeb76e000000",
    "eTag": "\"0x8D4BCC2E4835CD0\"",
    "contentType": "text/plain",
    "contentLength": 0,
    "contentOffset": 0,
    "blobType": "BlockBlob",
    "url": "https://my-storage-account.dfs.core.windows.net/my-file-system/new-file.txt",
    "sequencer": "00000000000004420000000000028963",  
    "storageDiagnostics": {
    "batchId": "b68529f3-68cd-4744-baa4-3c0498ec19f0"
    }
  },
  "dataVersion": "2",
  "metadataVersion": "1"
}]

Actual EventGrid message coming through now:

  • Notice blobUrl has been added to the schema
{
    "id": "long-string",
    "data": {
        "api": "CreateFile",
        "requestId": "long-string",
        "eTag": "0x8D89B4FF7150079",
        "contentType": "application/octet-stream",
        "contentLength": 0,
        "contentOffset": 0,
        "blobType": "BlockBlob",
        "blobProperties": [{
            "acl": [{
                "access": "u::rw,u:long-string:rwx,u:long-string:rwx,g::rx,g:long-string:rx,m::rw,o::",
                "permission": "0660",
                "owner": "long-string",
                "group": "$superuser"
            }]
        }],
        "blobUrl": "https://myfunction.blob.core.windows.net/container/20201208/730420201208080239.csv",
        "url": "https://myfunction.dfs.core.windows.net/container/20201208/730420201208080239.csv",
        "sequencer": "0000000000000000000000000000692c00000000000e1a99",
        "identity": "long-string",
        "storageDiagnostics": {
            "batchId": "long-string"
        }
    },
    "topic": "/subscriptions/long-string/resourceGroups/myResourceGroup/providers/Microsoft.Storage/storageAccounts/storageAccount",
    "subject": "/blobServices/default/containers/container/blobs/20201208/730420201208080239.csv",
    "event_type": "Microsoft.Storage.BlobCreated"
}

Another caveat here: notice the contentLength of 0 in the message above.

  • The CreateFile API call is not the event that indicates the blob has actually been created.
  • The FlushWithClose call is.

There is a note in the docs about this. So I also had to set up an EventGrid Advanced Filter that only triggers the Function when FlushWithClose events are generated (!)
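The same check can also be applied in code as a belt-and-braces guard alongside the Advanced Filter. A minimal sketch (the helper name is my own):

```python
def is_blob_committed(event_data: dict) -> bool:
    """Return True only for events that represent a fully written blob."""
    # Assumption: for ADLS Gen2 accounts, CreateFile fires before any data
    # is written (hence contentLength 0); FlushWithClose signals completion.
    return event_data.get("api") == "FlushWithClose"
```

The Function's `main` could then return early when `is_blob_committed(event.get_json())` is False, so a half-written blob is never downloaded.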

