
My Terraform code is broadly architected like so:

module "firewall_hub" {
  # This creates the Azure Firewall resource

  source      = "/path/to/module/a"
  # attribute = value...
}

module "firewall_spoke" {
  # This creates, amongst other things, firewall rule sets

  source      = "/path/to/module/b"
  hub         = module.firewall_hub
  # attribute = value...
}

module "another_firewall_spoke" {
  # This creates, amongst other things, firewall rule sets

  source      = "/path/to/module/c"
  hub         = module.firewall_hub
  # attribute = value...
}

That is, the Azure Firewall resource is created in module.firewall_hub, which is passed as an input to module.firewall_spoke and module.another_firewall_spoke; those modules create their own resources and inject firewall rule sets into the Firewall resource. Importantly, the rule sets are mutually exclusive between spoke modules and designed such that their priorities don't overlap.

When I try to deploy this code (either build or destroy), Azure throws an error:

Error: deleting Application Rule Collection "XXX" from Firewall "XXX (Resource Group "XXX"): network.AzureFirewallsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status= Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/XXX" Details=[]

My working hypothesis is that Azure cannot process multiple create/update/delete requests for firewall rule sets against the same firewall simultaneously, even if the rule sets are mutually exclusive. Indeed, if you wait a minute or so after the failed deployment and restart it, without changing any Terraform code or manually updating resources in Azure, it carries on happily and completes without error.

To test this assumption, I tried to work around the problem by forcing serialisation of the modules:

module "another_firewall_spoke" {
  # This creates, amongst other things, firewall rule sets

  source      = "/path/to/module/c"
  hub         = module.firewall_hub
  # attribute = value...

  depends_on = [module.firewall_spoke]
}

Unfortunately, however, this is not possible with the way my modules are written; Terraform reports:

Providers cannot be configured within modules using count, for_each or depends_on.
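(For context: the spoke modules configure their own azurerm provider internally, which is what triggers this restriction. A minimal sketch, with the actual settings elided:)

# Inside /path/to/module/b and /path/to/module/c
provider "azurerm" {
  features {}
  # subscription_id, tenant_id, etc.
}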

Short of rewriting my modules (not an option), is it possible to get around this race condition -- if that's the problem -- or would you consider it a bug with the azurerm provider (i.e., it should recognise that API error response and wait its turn, up to some timeout)?

(Terraform v1.1.7, azurerm v2.96.0)

Xophmeister

1 Answer


Following @silent's tip-off to this answer, I was able to resolve the race condition using the method described therein.

Something like this:

module "firewall_hub" {
  # This creates the Azure Firewall resource

  source      = "/path/to/module/a"
  # attribute = value...
}

module "firewall_spoke" {
  # This creates, amongst other things, firewall rule sets
  # Has an output "blockers" containing resources that cannot be deployed concurrently

  source      = "/path/to/module/b"
  hub         = module.firewall_hub
  # attribute = value...
}

module "another_firewall_spoke" {
  # This creates, amongst other things, firewall rule sets

  source      = "/path/to/module/c"
  hub         = module.firewall_hub
  waits_for   = module.firewall_spoke.blockers
  # attribute = value...
}

So the trick is for your modules to export an output containing a list of all the resources that need to be deployed first. That output can then be used as an input to subsequent modules, where it is threaded through to the resources that require a depends_on value.
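For illustration, a sketch of what that output might look like in /path/to/module/b; the resource references here are placeholders for whatever rule collections the module actually creates:

# In /path/to/module/b
output "blockers" {
  description = "Resources that must finish deploying before another spoke touches the firewall"
  value = [
    azurerm_firewall_application_rule_collection.this,
    azurerm_firewall_network_rule_collection.this,
    # ...any other resources that contend for the firewall
  ]
}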

That is, in the depths of my module, resources have:

resource "some_resource" "foo" {
  # attribute = value...

  depends_on = [var.waits_for]
}

There are two important notes to bear in mind when using this method:

  1. The waits_for variable in your module must have type any; list(any) doesn't work, as Terraform interprets this as a homogeneous list (which it most likely won't be). See the sketch after this list.

  2. Weirdly, imo, the depends_on clause requires you to explicitly use a list literal (i.e., [var.waits_for] rather than just var.waits_for), even if the variable you are threading through is a list. This doesn't type check in my head, but apparently Terraform is not only fine with it, but it expects it!
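Putting those notes into a variable declaration, the receiving module ends up with something like this (a sketch; the description is illustrative):

# In /path/to/module/c
variable "waits_for" {
  description = "Opaque hand-off of resources that must be deployed first"
  type        = any   # not list(any); see note 1 above
  default     = []
}

The resource-level depends_on = [var.waits_for] shown earlier then covers note 2.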

Xophmeister