
We are running Terraform through an Azure pipeline to create a Databricks workspace and related resources. However, when the Terraform apply stage reaches the point where it fetches the latest Spark version, the process throws an error.

Error is:

│ Error: default auth: cannot configure default credentials. Config: profile=DEFAULT, azure_client_secret=***, azure_client_id=***, azure_tenant_id=*****-*****. Env: ARM_CLIENT_SECRET, ARM_CLIENT_ID, ARM_TENANT_ID
│ 
│   with data.databricks_spark_version.latest_lts,
│   on databricks.tf line 33, in data "databricks_spark_version" "latest_lts":
│   33: data "databricks_spark_version" "latest_lts" {
│ 

We are using a service principal that was created in Azure AD and has been given the account admin role in our Databricks account.

We've declared databricks_connection_profile in a variables file:

databricks_connection_profile = "DEFAULT"
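
This variable is passed to the Databricks provider's profile argument; roughly, the pattern from the examples we've been following looks like this (a sketch of the pattern, not the exact file):

provider "databricks" {
  # Points the provider at a profile in ~/.databrickscfg
  profile = var.databricks_connection_profile
}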

The part that appears to be at fault is the databricks_spark_version data source towards the bottom of this:

resource "azurerm_databricks_workspace" "dbw-uks" {
  name                = "dbw-uks"
  resource_group_name = azurerm_resource_group.rg-dataanalytics-uks-0002.name
  location            = azurerm_resource_group.rg-dataanalytics-uks-0002.location
  sku                 = "standard"

  depends_on = [
    azuread_service_principal.Databricks
  ]

  tags = merge(local.common_tags, local.extra_tags)
}


output "databricks_host" {
  value = "https://${azurerm_databricks_workspace.dbw-uks.workspace_url}/"  
}

# #--------------- dbr-dataanalytics-uks-0002 Cluster ---------------#

data "databricks_node_type" "smallest" {
  local_disk = true

  depends_on = [
    azurerm_databricks_workspace.dbw-uks
  ]
}

data "databricks_spark_version" "latest_lts" {
   long_term_support = true

   depends_on = [
    azurerm_databricks_workspace.dbw-uks
  ]
}

We've followed various tutorials from both Microsoft and HashiCorp, but with no positive results so far.

  • How do you populate the DEFAULT profile if you create a workspace from the same terraform template? – Alex Ott Apr 14 '23 at 15:42
  • I suspect we haven't. We can create the workspace using the Terraform Active Directory service principal, which has admin rights in Databricks, but when we come to create the cluster it needs a Spark version, and all of the examples we've found so far just say to use databricks_connection_profile = "DEFAULT" – Simon Apr 17 '23 at 07:54

2 Answers


Really, if you're creating the Databricks workspace using the service principal, you can continue to use it to access and create Databricks resources and data sources. You don't need to specify databricks_connection_profile; you just need to configure the provider authentication correctly by providing the host and the other attributes needed for service principal authentication. The Databricks Terraform provider uses the same environment variables as the azurerm provider (ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID).
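
For example, a minimal sketch of an explicit provider block (the var.* names are assumptions; the azure_* attributes can also be left out and picked up from the ARM_* environment variables listed in the error message):

provider "databricks" {
  # Point the provider at the workspace created in this same configuration
  host = azurerm_databricks_workspace.dbw-uks.workspace_url

  # Service principal credentials; alternatively omit these and let the provider
  # read ARM_CLIENT_ID / ARM_CLIENT_SECRET / ARM_TENANT_ID from the environment
  azure_client_id     = var.client_id
  azure_client_secret = var.client_secret
  azure_tenant_id     = var.tenant_id
}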

Alex Ott

If you are using modules and also have multiple databricks providers in your configuration, you need to explicitly pass the workspace provider. In our case we pass the provider to the module where the databricks_spark_version.latest_lts data source is defined, like this:

module "foo" {
 ...
 providers = {
    databricks.workspace = databricks.workspace
 }
}

and then we can use latest_lts inside the module in the following way:

// Inside module.foo
data "databricks_spark_version" "latest_lts" {
  provider          = databricks.workspace
  long_term_support = true
}
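
For the provider to be passable into the module like this, the child module also has to declare the alias it expects, and the root module needs a matching aliased provider "databricks" block. A minimal sketch of those supporting declarations, assuming the alias name workspace from above:

// Inside module.foo - declare the provider alias the module expects
terraform {
  required_providers {
    databricks = {
      source                = "databricks/databricks"
      configuration_aliases = [databricks.workspace]
    }
  }
}

// In the root module - define the aliased provider pointing at the workspace
provider "databricks" {
  alias = "workspace"
  host  = azurerm_databricks_workspace.dbw-uks.workspace_url
}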

Hope this works out.

Jonas M.W.