
I am currently using an Azure Durable Function (orchestrator) to coordinate a process that involves creating a job in Azure Databricks. The Durable Function creates the job via the Azure Databricks REST API and provides a callback URL. Once the job is created, the orchestrator waits indefinitely for the external event (callback) that signals the completion of the job (callback pattern). The job in Azure Databricks is wrapped in a try/except block to ensure that a status (success/failure) is reported back to the orchestrator no matter the outcome.
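The try/except wrapper described above can be sketched like this (a minimal sketch; `run_job_with_callback`, the payload shape, and the injectable `post` hook are my assumptions, not Databricks or Durable Functions APIs):

```python
import json
import urllib.request

def run_job_with_callback(callback_url, job_fn, post=None):
    """Wrap the Databricks job body so that a status is reported to the
    orchestrator's callback URL no matter the outcome (hypothetical helper)."""
    if post is None:
        def post(url, payload):
            # POST to the callback URL raises the orchestrator's external event.
            req = urllib.request.Request(
                url,
                data=json.dumps(payload).encode(),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
            urllib.request.urlopen(req)
    try:
        result = job_fn()
        post(callback_url, {"status": "success", "result": result})
        return "success"
    except Exception as exc:
        post(callback_url, {"status": "failure", "error": str(exc)})
        return "failure"
```

As the question notes, this only covers exceptions raised inside the job body; it cannot fire if the cluster or run dies with an Internal Error before the except branch runs.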

However, I am concerned about the scenario where the job ends in an Internal Error state: the reporting code is never executed, leaving the orchestrator waiting indefinitely. To ensure reliability, I am considering several solutions:

  • Setting a timeout on the orchestrator
  • Polling: Checking the status of the job every x minutes
  • Using an event-driven architecture by writing an event to a topic (e.g. Azure Event Grid) and having a separate service subscribe to it
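The polling fallback from the list above could look like this (a sketch; `poll_job_status` and the injectable `get_status`/`sleep` hooks are hypothetical — `get_status` would wrap a call to `GET /api/2.1/jobs/runs/get`, and the terminal-state names loosely follow the Jobs 2.1 run states):

```python
import time

def poll_job_status(get_status, interval_s=300, max_wait_s=3600, sleep=time.sleep):
    """Check the job status every interval_s seconds until it reaches a
    terminal state or max_wait_s elapses. get_status is a hypothetical
    callable wrapping the Databricks runs/get endpoint."""
    # Assumed terminal states; check the Jobs 2.1 docs for the exact values.
    terminal = {"SUCCESS", "FAILED", "TIMEDOUT", "CANCELED", "INTERNAL_ERROR"}
    waited = 0
    while waited < max_wait_s:
        status = get_status()
        if status in terminal:
            return status
        sleep(interval_s)
        waited += interval_s
    return "POLL_TIMEOUT"
```

The first option in the list maps to Durable Functions' built-in primitives: race `context.wait_for_external_event(...)` against a `context.create_timer(...)` with `context.task_any([...])`, so the orchestrator wakes up even if the callback never arrives.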

My question is: can I send events to a topic (Azure Event Grid) when the Databricks job completes (succeeds, fails, errors, every possible outcome) to ensure that the orchestrator is notified and can take appropriate action? Looking at the REST API Jobs 2.1 docs, I can get notified via email or specify a webhook on start, success and failure (preview feature). Can I enter the Event Grid topic URL here so that Databricks writes events to it? There are docs on managing notification destinations, but it's not clear to me. Is there another way in Azure to achieve the same result?

Edit: I've looked into the documentation to find how to manage notification destinations and created a new system notification (screenshot omitted).

However, when testing the connection, the request fails (screenshots omitted):

401: Request must contain one of the following authorization signature: aeg-sas-token, aeg-sas-key. Report '12345678-7678-4ab9-b90f-37aabf1b10b8:7:1/23/2023 6:17:09 PM (UTC)' to our forums for assistance or raise a support ticket.

The same happens with a plain POST request from any other client (e.g. Postman). So the question is: how can I provide a token/key so that Databricks can write events to the topic?
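For reference, a client that satisfies that 401 attaches the topic's access key in an `aeg-sas-key` header. Since the Databricks notification destination cannot add such a header itself, an intermediary (e.g. an Azure Function receiving the Databricks webhook) would have to do the publishing. A sketch, assuming a hypothetical event type and payload shape (the Event Grid event schema fields themselves are real):

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

def publish_job_event(topic_endpoint, sas_key, subject, data, send=None):
    """Publish a custom event to an Event Grid topic using key auth.
    The 'aeg-sas-key' header is exactly what the 401 response asks for."""
    event = [{
        "id": str(uuid.uuid4()),
        "eventType": "Databricks.JobCompleted",   # hypothetical event type
        "subject": subject,                       # e.g. "jobs/123"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,
        "dataVersion": "1.0",
    }]
    body = json.dumps(event).encode()
    headers = {"Content-Type": "application/json", "aeg-sas-key": sas_key}
    if send is not None:  # injection point so the sketch can run offline
        return send(topic_endpoint, body, headers)
    req = urllib.request.Request(topic_endpoint, data=body,
                                 headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```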

I've also posted this question here: Webhook Security (Bearer Auth)

  • I believe it's possible to send events to an Azure Event Grid topic upon completing a Databricks job. You can use Azure Functions to create a custom event handler that listens for events from Databricks and sends them to the Event Grid topic. Then use the Azure Functions HTTP trigger and specify the authentication type as Bearer Auth in the function code. – BhargavaGunnam-MSFT Feb 02 '23 at 00:59
  • @BhargavaGunnam-MSFT Thanks, since `System Notifications` in Databricks do not yet support Bearer Auth (Jan 2023), I will look into how I can use Azure Functions to listen for events coming from Databricks. Any resources you can point me to? – Johannes Schmidt Feb 02 '23 at 10:07
  • You can use the Azure Event Grid/Event Hubs triggers for Azure Functions to listen for events coming from Databricks. Please see these documents: 1) https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-event-grid?tabs=in-process%2Cextensionv3&pivots=programming-language-csharp 2) https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-create – BhargavaGunnam-MSFT Feb 08 '23 at 23:05
