
I am creating a Databricks cluster using Terraform and would like to set up Datadog on it, so that whenever a new cluster (master/worker nodes) is created, its logs are pushed to Datadog. How do we push logs to Datadog?

I tried the code below, but I am not sure how to get the values for the variables marked with ? that are passed to the Datadog init script. Also, the script shown in the Datadog portal is meant to be installed from a Python notebook, but I am using it as a bash script in Terraform, so what content needs to go into the init script?

This code creates the cluster:

resource "databricks_cluster" "cluster" {
  cluster_name = "abc"
  ...
  
}

This code installs the Datadog agent on the driver and worker nodes, as per this:

resource "databricks_global_init_script" "init-datadog" {
  source = data.template_file.global_init.rendered
  name = "install dd script"
}

data "template_file" "global_init" {
  template = "${file("${path.module}/script/datadog-install-driver.worker.sh")}"
  vars = {
    DD_API_KEY = "xxx"
    DD_ENV = "dev"
    DB_IS_DRIVER = true
    DB_DRIVER_IP = ? 
    DB_CLUSTER_ID = ?
    SPARK_LOCAL_IP = ?
    DB_DRIVER_PORT = ?
    hostile = ?
    DB_CLUSTER_NAME = ?
  }
}
Alex Ott
raj

1 Answer


Variables like DB_IS_DRIVER, DB_CLUSTER_NAME, ... (full list is here) are set automatically in the environment when the init script is executed, so you don't need to set them in the Terraform init script template.
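
As a minimal sketch of what that means in practice, the script below only reads those variables at run time and never receives them from Terraform. The `node_role` helper and the output format are hypothetical, and the check against `TRUE` assumes the value Databricks sets on the driver node. The variables are written without braces (`$DB_IS_DRIVER`, not `${DB_IS_DRIVER}`) so that Terraform's `${...}` template interpolation leaves them alone.

```shell
#!/bin/bash
# Sketch only: DB_IS_DRIVER, DB_CLUSTER_NAME and DB_CLUSTER_ID are expected
# to be exported by Databricks before the init script runs. Outside
# Databricks they are simply empty.

# Hypothetical helper: map the DB_IS_DRIVER value to a role label.
node_role() {
  if [ "$1" = "TRUE" ]; then
    echo "driver"
  else
    echo "worker"
  fi
}

ROLE="$(node_role "$DB_IS_DRIVER")"

# Brace-less variables pass through a Terraform template unchanged.
echo "cluster=$DB_CLUSTER_NAME id=$DB_CLUSTER_ID role=$ROLE"
```

With this style, the `vars` map in the template only needs the values Terraform actually owns, such as DD_API_KEY and DD_ENV.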

P.S. Storing init scripts on DBFS isn't recommended anymore - use the workspace init script type instead (it may require the databricks_workspace_file resource, which isn't released yet).

Alex Ott
  • Thanks for the response @Alex Ott. However, when I don't set those variable values in the template_file data block, I get the error ```Error failed to render: Unknown variable; there is no variable named xx```. Also, how do we pass the Datadog init script, via DBFS or as in the example above, and exactly what content needs to be in the script file? An example or reference would be great. Thanks – raj May 10 '23 at 03:10
  • You shouldn’t treat these variables as template variables - maybe mask them or something like that – Alex Ott May 10 '23 at 04:39
  • It looks like adding an extra $ sign to escape the variable interpolation works, i.e. $${DB_DRIVER_PORT}. But which part of this script needs to be passed to the global init script? https://docs.datadoghq.com/integrations/databricks/?tab=allnodes#install-the-datadog-agent-on-driver-and-worker-nodes ```#!/bin/bash cat <<EOF > /tmp/start_datadog.sh #!/bin/bash date -u +"%Y-%m-%d %H:%M:%S UTC" echo "Running on the driver? $DB_IS_DRIVER" echo "Driver ip: $DB_DRIVER_IP" .... ``` When I copied this content from the DD site, I got the error ```Error:file #!/bin/bash``` – raj May 12 '23 at 03:35
  • Set tracing in your script and enable cluster logs so you can see what went wrong with your init script – Alex Ott May 12 '23 at 05:59
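
One way to follow the tracing suggestion in the last comment is sketched below. It turns on shell tracing with `set -x` and sends the trace to a file so it can be inspected afterwards. The log path `/tmp/dd-init-trace.log` is a hypothetical choice, not something Databricks or Datadog prescribes, and a real init script would normally leave tracing on for its whole run rather than switch it off again as this short demo does.

```shell
#!/bin/bash
# Hypothetical trace file location; pick any path you can read later.
TRACE_LOG="/tmp/dd-init-trace.log"
: > "$TRACE_LOG"

# Save the original stderr on fd 3, then append stderr to the trace file.
# The shell writes its -x trace to stderr, so the trace lands in the file.
exec 3>&2 2>>"$TRACE_LOG"
set -x

# Every command from here on is recorded with its expanded arguments.
echo "Running on the driver? $DB_IS_DRIVER"

# Demo only: stop tracing and restore stderr.
set +x
exec 2>&3 3>&-
```

Reading the trace file after a failed start shows the last command that ran and the values the variables actually had at that point.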