I have a service principal (enterprise application) that has been granted access to many Azure tenants so that it can deploy infrastructure into different Azure accounts/subscriptions.

To deploy Azure resources via Terraform, I create a custom role for each module. For instance, to deploy a virtual network, I create a role with the necessary permissions. The problem is that on terraform apply the role is created and assigned to the service principal, but then one of the resources fails with a permissions error. Running another terraform apply right afterwards succeeds.

Below is a quick example of the issue; I hope someone has a viable solution. I have tried adding a null resource that waits before the remaining resources are created, but whether the wait is 2 seconds or 10 minutes, there is always a permissions error. I am unable to use a blanket role like Contributor to resolve this issue.
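
For reference, the wait described above would look roughly like the following. This is a reconstruction under assumptions, not the exact code that was used: the resource name wait_for_role_assignment and the 60-second value are illustrative, and the resource group's depends_on would point at the null resource instead of the role assignment.

```hcl
# Sketch of a fixed delay between the role assignment and the first resource.
# Assumes the hashicorp/null provider and a shell that provides `sleep`.
resource "null_resource" "wait_for_role_assignment" {
  depends_on = [azurerm_role_assignment.vpc_role_assignment]

  provisioner "local-exec" {
    command = "sleep 60" # illustrative value; longer waits made no difference
  }
}

resource "azurerm_resource_group" "example" {
  name     = "example-resource-group"
  location = "eastus"

  # Wait for the delay rather than only for the role assignment itself.
  depends_on = [null_resource.wait_for_role_assignment]
}
```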

main.tf

terraform {
  required_version = "1.4.6"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.58.0"
    }
  }
}

provider "azurerm" {
  features {
    resource_group {
      prevent_deletion_if_contains_resources = false
    }
  }
  skip_provider_registration = true
  tenant_id                  = "<tenant_id>" 
  subscription_id            = "<subscription_id>" 
  client_id                  = "<service_principle_id>"
  client_secret              = "<client_secret>"
}

resource "azurerm_role_definition" "vpc_role" {
  name        = "VPCRole"
  description = "Role to deploy and manage VPC + Network"
  scope       = "/subscriptions/<subscription_id>"

  permissions {
    actions = [
      "Microsoft.Resources/subscriptions/resourceGroups/read",
      "Microsoft.Resources/subscriptions/resourceGroups/write",
      "Microsoft.Resources/subscriptions/resourceGroups/delete",
      "Microsoft.Network/natGateways/read",
      "Microsoft.Network/natGateways/write",
      "Microsoft.Network/natGateways/delete",
      "Microsoft.Network/natGateways/join/action",
      "Microsoft.Network/publicIPAddresses/read",
      "Microsoft.Network/publicIPAddresses/write",
      "Microsoft.Network/publicIPAddresses/delete",
      "Microsoft.Network/publicIPAddresses/join/action",
      "Microsoft.Network/virtualNetworks/read",
      "Microsoft.Network/virtualNetworks/write",
      "Microsoft.Network/virtualNetworks/delete",
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/write",
      "Microsoft.Network/virtualNetworks/subnets/delete",
      "Microsoft.Network/virtualNetworks/subnets/join/action",
    ]
  }
}

data "azurerm_subscription" "primary" {
}

data "azurerm_client_config" "current" {
}

resource "azurerm_role_assignment" "vpc_role_assignment" {
  scope                            = data.azurerm_subscription.primary.id
  role_definition_id               = azurerm_role_definition.vpc_role.role_definition_resource_id
  principal_id                     = data.azurerm_client_config.current.object_id # needs to be object_id or else it comes up identity not found
  skip_service_principal_aad_check = true

  depends_on = [ azurerm_role_definition.vpc_role ]
}

resource "azurerm_resource_group" "example" {
  name     = "example-resource-group"
  location = "eastus"

  depends_on = [ azurerm_role_assignment.vpc_role_assignment ]
}


resource "azurerm_virtual_network" "example" {
  name                = "example-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "example-subnet"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_public_ip" "nat_ip_1" {
  name                = "example-nat-ip-1"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  allocation_method = "Static"
  sku               = "Standard"
}

resource "azurerm_nat_gateway" "nat_gw_1" {
  name                = "example-nat-gw-1"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_nat_gateway_public_ip_association" "ng_ip_1_assoc" {
  nat_gateway_id       = azurerm_nat_gateway.nat_gw_1.id
  public_ip_address_id = azurerm_public_ip.nat_ip_1.id
}

Output

Plan: 8 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

azurerm_role_definition.vpc_role: Creating...
azurerm_role_definition.vpc_role: Creation complete after 2s [id=/subscriptions/<sub_id>/providers/Microsoft.Authorization/roleDefinitions/eb2c6f71-162e-a62b-5d15-4c4b8a0ba0ee|/subscriptions/<sub_id>]
azurerm_role_assignment.vpc_role_assignment: Creating...
azurerm_role_assignment.vpc_role_assignment: Still creating... [10s elapsed]
azurerm_role_assignment.vpc_role_assignment: Still creating... [20s elapsed]
azurerm_role_assignment.vpc_role_assignment: Creation complete after 23s [id=/subscriptions/<sub_id>/providers/Microsoft.Authorization/roleAssignments/5d4d4ee3-6822-9af7-66ad-3550dc1ab9c5]
azurerm_resource_group.example: Creating...
╷
│ Error: checking for presence of existing resource group: resources.GroupsClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<client_id>' with object id '<object_id>' does not have authorization to perform action 'Microsoft.Resources/subscriptions/resourcegroups/read' over scope '/subscriptions/<sub_id>/resourcegroups/example-resource-group' or the scope is invalid. If access was recently granted, please refresh your credentials."
│
│   with azurerm_resource_group.example,
│   on main.tf line 72, in resource "azurerm_resource_group" "example":
│   72: resource "azurerm_resource_group" "example" {
│
╵

Expected Behaviour

After the role is created and assigned, any resources that depend on the role being assigned to the service principal used by the provider should deploy successfully.

Gorgon_Union

1 Answer

I tried to reproduce the issue in my environment by running your code, and it occurred only intermittently: sometimes the error appeared, other times it did not. The role assignment may not take effect immediately, so the resources that depend on it need to wait for some time after the role is assigned. To mitigate the error, I added a time_sleep resource after the role assignment. The complete code is below.

terraform {
  required_version = "1.4.6"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.58.0"
    }
    # Declared explicitly because time_sleep is used below
    time = {
      source = "hashicorp/time"
    }
  }
}

provider "azurerm" {
  features {
    resource_group {
      prevent_deletion_if_contains_resources = false
    }
  }
  skip_provider_registration = true
  tenant_id                  = "" 
  subscription_id            = "" 
  client_id                  = ""
  client_secret              = ""
}

resource "azurerm_role_definition" "vpc_role" {
  name        = "VPCRole03"
  description = "Role to deploy and manage VPC + Network"
  scope       = "/subscriptions/7195d375-7af2-43f1-bd66-12e77ac05818"

  permissions {
    actions = [
      "Microsoft.Resources/subscriptions/resourceGroups/read",
      "Microsoft.Resources/subscriptions/resourceGroups/write",
      "Microsoft.Resources/subscriptions/resourceGroups/delete",
      "Microsoft.Network/natGateways/read",
      "Microsoft.Network/natGateways/write",
      "Microsoft.Network/natGateways/delete",
      "Microsoft.Network/natGateways/join/action",
      "Microsoft.Network/publicIPAddresses/read",
      "Microsoft.Network/publicIPAddresses/write",
      "Microsoft.Network/publicIPAddresses/delete",
      "Microsoft.Network/publicIPAddresses/join/action",
      "Microsoft.Network/virtualNetworks/read",
      "Microsoft.Network/virtualNetworks/write",
      "Microsoft.Network/virtualNetworks/delete",
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/write",
      "Microsoft.Network/virtualNetworks/subnets/delete",
      "Microsoft.Network/virtualNetworks/subnets/join/action",
    ]
  }
}

data "azurerm_subscription" "primary" {
}

data "azurerm_client_config" "current" {
}

resource "azurerm_role_assignment" "vpc_role_assignment" {
  scope                            = data.azurerm_subscription.primary.id
  role_definition_id               = azurerm_role_definition.vpc_role.role_definition_resource_id
  principal_id                     = data.azurerm_client_config.current.object_id # needs to be object_id or else it comes up identity not found
  skip_service_principal_aad_check = true

  depends_on = [ azurerm_role_definition.vpc_role ]
}

# Pause after the role assignment so it has time to take effect
resource "time_sleep" "wait_60_seconds" {
  create_duration = "60s"

  depends_on = [azurerm_role_assignment.vpc_role_assignment]
}

resource "azurerm_resource_group" "example" {
  name     = "example-resource-group-03"
  location = "eastus"

  depends_on = [
    azurerm_role_assignment.vpc_role_assignment,
    time_sleep.wait_60_seconds,
  ]
}


resource "azurerm_virtual_network" "example" {
  name                = "example-vnet-03"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "example-subnet-03"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_public_ip" "nat_ip_1" {
  name                = "example-nat-ip-1-03"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  allocation_method = "Static"
  sku               = "Standard"
}

resource "azurerm_nat_gateway" "nat_gw_1" {
  name                = "example-nat-gw-1-03"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_nat_gateway_public_ip_association" "ng_ip_1_assoc" {
  nat_gateway_id       = azurerm_nat_gateway.nat_gw_1.id
  public_ip_address_id = azurerm_public_ip.nat_ip_1.id
}

Output: https://i.imgur.com/c30gSN8.png
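
A possible refinement of the wait (a sketch, not part of the code tested above): time_sleep accepts a triggers map, so tying it to the role assignment's ID should make the delay run again whenever the assignment is recreated, not only on the first apply. The resource name wait_for_rbac and the trigger key role_assignment_id are hypothetical.

```hcl
# Sketch: re-run the delay whenever the role assignment is replaced.
resource "time_sleep" "wait_for_rbac" {
  create_duration = "60s"

  # A change to any value in this map replaces the time_sleep resource,
  # which applies create_duration again. Referencing the assignment here
  # also creates the dependency, so no explicit depends_on is needed.
  triggers = {
    role_assignment_id = azurerm_role_assignment.vpc_role_assignment.id
  }
}

resource "azurerm_resource_group" "example" {
  name     = "example-resource-group-03"
  location = "eastus"

  depends_on = [time_sleep.wait_for_rbac]
}
```

This only changes when the delay runs. It does not help if the permissions error is unrelated to how long Terraform waits.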

HowAreYou
  • Thanks for the response. This does seem intermittent for others, but unfortunately it doesn't resolve the issue. As stated, I have already tried a `null_resource` with `sleep 60` up to `sleep 500`, and the issue persists. If it isn't the resource group that causes the error, it's whichever resource follows the resource group creation. This seems more like an Azure/provider issue with eventual consistency on the role being attached to the service principal, or maybe some caching of Azure roles? – Gorgon_Union Jul 12 '23 at 14:03
  • @Gorgon_Union, thanks for your reply. Did you try adding `depends_on` to the time_sleep resource and the azurerm_resource_group resource as shown above? – HowAreYou Jul 13 '23 at 06:07
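
Since the question reports that a second terraform apply always succeeds, one workaround to consider is splitting the role setup and the network resources into two separately applied configurations. The sketch below assumes that the second run works because the provider authenticates again after the role exists, as the "refresh your credentials" hint in the error suggests; that is not confirmed. The directory names rbac/ and network/ and the output name are hypothetical.

```hcl
# rbac/main.tf (hypothetical layout): applied first.
# Holds the provider block, azurerm_role_definition.vpc_role and
# azurerm_role_assignment.vpc_role_assignment as written in the question.
output "vpc_role_assignment_id" {
  description = "ID of the role assignment, exposed for reference"
  value       = azurerm_role_assignment.vpc_role_assignment.id
}

# network/main.tf (hypothetical layout): applied second, as a separate
# terraform run with the same provider block, after rbac/ has completed.
resource "azurerm_resource_group" "example" {
  name     = "example-resource-group"
  location = "eastus"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}
```

The trade-off is two state files and two apply steps per subscription, in exchange for never creating resources in the same run that grants the permission to create them.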