
Does anyone have a way to monitor a group of job clusters in Azure Databricks?

We just want to make sure the job clusters are up and running, perhaps with a Dashboard or Workbook in Azure that shows red or green depending on the status of each job cluster.

We have these NRT interfaces pulling data from a source application via these job clusters, and we would like to see when they are down. We already get an alert when the service goes down, but a panel where we can see these interfaces would be really useful. Perhaps something that makes use of an API call would be needed, unless there is something out of the box like the Ganglia reports, but I haven't seen anything close to what I'm looking for.

Thanks in advance for any answer you may provide.

mac
  • I have not used job clusters much, but the CLI command `databricks clusters list` gives output in the format `CLUSTER_ID_01 CLUSTER_NAME_01 RUNNING CLUSTER_ID_02 CLUSTER_NAME_02 TERMINATED`. There is also a [Clusters REST API](https://docs.databricks.com/dev-tools/api/latest/clusters.html) that you could call in a loop. One caveat: I am not sure whether job clusters get listed. If you use the `GET` call, you could filter on cluster name or ID, if available. – rainingdistros Feb 02 '23 at 08:23
  • Thanks, that's a good start and provides a path forward. – mac Feb 02 '23 at 16:47

1 Answer


You can get the status of Azure Databricks jobs by calling the Jobs API, as follows.

Create a personal access token (PAT) in the Databricks workspace.

Copy the token and save it; you will need it to call the APIs later.

I created one Databricks cluster and a job that runs a notebook, then ran the job.

I then called the API to get the job details:

https://adb-xxxxxxxxxxxx8.18.azuredatabricks.net/api/2.1/jobs/list

Set the authorization type to Bearer Token and supply the PAT generated earlier.
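For anyone scripting this instead of using an API client UI, here is a minimal Python sketch of the same call using only the standard library. The workspace URL and token are hypothetical placeholders, and `build_request` is just an illustrative helper name, not part of any Databricks SDK.

```python
import urllib.request


def build_request(workspace_url: str, api_path: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the PAT as a bearer token."""
    url = workspace_url.rstrip("/") + api_path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical values: replace with your own workspace URL and PAT.
req = build_request(
    "https://adb-example.azuredatabricks.net",
    "/api/2.1/jobs/list",
    "dapi-example-token",
)

# Sending the request needs network access and a valid token:
# import json
# with urllib.request.urlopen(req) as resp:
#     jobs = json.load(resp)
```

In practice the token would be read from a secret store or an environment variable rather than hard-coded.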

The call returned the job details as JSON.

You can call this API on a schedule and log the responses to monitor job status.
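As a rough sketch of how such responses could feed a red/green panel: `jobs/list` returns job definitions, so for run status you would likely also call `/api/2.1/jobs/runs/list`, whose runs carry a `state` object with `life_cycle_state` and `result_state`. The colour rules and the sample data below are my own assumptions, not something Databricks provides.

```python
def run_colour(run: dict) -> str:
    """Map one run's state to 'green' or 'red' (assumed rules, adjust to taste)."""
    state = run.get("state", {})
    life_cycle = state.get("life_cycle_state")
    result = state.get("result_state")
    if life_cycle in ("PENDING", "RUNNING"):
        return "green"
    if life_cycle == "TERMINATED" and result == "SUCCESS":
        return "green"
    return "red"


def panel(runs: list) -> dict:
    """Keep the most recent run per job_id and return {job_id: colour}."""
    latest = {}
    for run in runs:
        job_id = run["job_id"]
        if job_id not in latest or run.get("start_time", 0) > latest[job_id].get("start_time", 0):
            latest[job_id] = run
    return {job_id: run_colour(run) for job_id, run in latest.items()}


# Hypothetical sample shaped like the "runs" array of a runs/list response.
sample_runs = [
    {"job_id": 1, "start_time": 100, "state": {"life_cycle_state": "TERMINATED", "result_state": "FAILED"}},
    {"job_id": 1, "start_time": 200, "state": {"life_cycle_state": "RUNNING"}},
    {"job_id": 2, "start_time": 150, "state": {"life_cycle_state": "INTERNAL_ERROR"}},
]
status = panel(sample_runs)
print(status)
```

For an always-on NRT interface you may prefer to treat any terminated run as red, even a successful one, since the job is expected to keep running.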

You can also check directly whether your cluster is running in the cluster's event log in Azure Databricks.
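If you would rather check cluster state programmatically, the Clusters API (`GET /api/2.0/clusters/list`) returns a `clusters` array in which each entry has a `state` and a `cluster_source`. A minimal sketch, assuming job clusters appear there with `cluster_source` set to `JOB`; the sample payload and cluster names are made up for illustration.

```python
# Cluster states treated as "up" here; this choice is an assumption.
UP_STATES = {"RUNNING", "RESIZING"}


def job_cluster_states(payload: dict) -> dict:
    """Return {cluster_name: state} for clusters created by the job scheduler."""
    return {
        c["cluster_name"]: c["state"]
        for c in payload.get("clusters", [])
        if c.get("cluster_source") == "JOB"
    }


def is_up(state: str) -> bool:
    """True when the cluster state counts as up for the dashboard."""
    return state in UP_STATES


# Hypothetical sample shaped like a clusters/list response.
sample_payload = {
    "clusters": [
        {"cluster_id": "a1", "cluster_name": "job-11-run-1", "state": "RUNNING", "cluster_source": "JOB"},
        {"cluster_id": "b2", "cluster_name": "job-12-run-7", "state": "TERMINATED", "cluster_source": "JOB"},
        {"cluster_id": "c3", "cluster_name": "adhoc", "state": "RUNNING", "cluster_source": "UI"},
    ]
}
states = job_cluster_states(sample_payload)
down = sorted(name for name, state in states.items() if not is_up(state))
print(states, down)
```

Job clusters are created per run, so their names and IDs change between runs; keying the panel on the job rather than the cluster may be more stable.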

You can also configure Databricks logging through log4j and send the logs to the Azure Monitor service for monitoring.

You can send the same log4j logs to Azure Log Analytics as well.

Additionally, you can use Ganglia and Datadog to monitor Azure Databricks.

References:

Send Databricks app logs to Azure Monitor - Azure Architecture Center | Microsoft Learn

Manage clusters - Azure Databricks | Microsoft Learn

Jobs API 2.1 | Databricks on AWS

SiddheshDesai
  • Thanks! This is really great information. Looking forward to using it with Azure Monitor. – mac Mar 02 '23 at 22:09