
I’m loading a JSON file with jsondecode() in Terraform, and I need to dynamically look up a path in the JSON tree. E.g. say I have the following JSON in file.json:

{
  "some1": {
    "path1": {
      "key1": "value1",
      "key2": "value2"
    }
  }
}

If I load this into a local called myjson, then I can write local.myjson.some1.path1.key1 to get value1.

But I need the path to be an input. The following does not work:

locals {
  tree = jsondecode(file("file.json"))

  path = ["some1", "path1", "key1"]
  value = local.tree[local.path]
}

I looked at all the built-in functions in Terraform, such as lookup, flatten, etc., but I could not see any combination that would allow me to loop over the elements of local.path to extract successively deeper elements of local.tree. The exception is try, which works nicely, but the maximum depth is hardcoded:

locals {
  level1 = try(local.tree[local.path[0]], null)
  level2 = try(local.level1[local.path[1]], local.level1)
  level3 = try(local.level2[local.path[2]], local.level2)
  level4 = try(local.level3[local.path[3]], local.level3)
  ...
  result = try(local.levelN[local.path[N]], local.levelN)
}

so regardless of how many levels there actually are in local.tree, result will contain the looked-up value.
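
For concreteness, here is that pattern written out to a fixed maximum depth of four, reusing the file and path from the example above (a minimal, runnable sketch of the question's own approach):

locals {
  tree = jsondecode(file("file.json"))
  path = ["some1", "path1", "key1"]

  # Hardcoded maximum depth of 4: each level indexes one step deeper into
  # the tree, and falls back to the previous level once the path is exhausted.
  level1 = try(local.tree[local.path[0]], null)
  level2 = try(local.level1[local.path[1]], local.level1)
  level3 = try(local.level2[local.path[2]], local.level2)
  level4 = try(local.level3[local.path[3]], local.level3)

  result = local.level4 # "value1" for the example path
}

Shorter paths still work, because each extra try simply falls back to the previous level's value.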

I can live with hardcoded N, but is there a better way, that does not have that limitation? (short of creating a custom provider that defines a data source that does this)

Oliver
  • It is not completely clear what the blocks of pseudo-code are attempting to accomplish here, but would you say the desired functionality is analogous to the dig method for the Hash class in Ruby: https://ruby-doc.org/core-3.1.2/Hash.html#method-i-dig? – Matthew Schuchard Jun 02 '22 at 11:08
  • @MattSchuchard the ruby `dig()` method is Terraform's `lookup()`, except that dig is more powerful, as it accepts a sequence of keys. If `lookup()` could do that, I could just use `lookup(local.tree, local.path...)` (the dot dot dot is an expansion operator in Terraform HCL) – Oliver Jun 02 '22 at 11:34
  • I double-checked, and this functionality depends on third-party packages in Go, so it is not currently available in intrinsic Terraform HCL2 without the implementation of custom functions. Currently your path of least resistance would be an external data source https://registry.terraform.io/providers/hashicorp/external/latest/docs/data-sources/data_source where the inputs are the map and keys, and the output would be the value (a minimal sketch of this approach appears after this comment thread). – Matthew Schuchard Jun 02 '22 at 12:40
  • The external data source approach is a valid contender. It introduces an external dependency (on whatever the external data source calls, eg bash, python or go). A local_exec could also be used, with the same caveat. So these two approaches (external data source and local_exec) are a tradeoff over mine: no limit on nesting level (or rather, limited only by the external tool used), but they introduce external dependency on local host. – Oliver Jun 02 '22 at 12:52
  • Yes it is not a super great solution for those reasons and others, but it is your least bad (and possibly only) path forward here for the desired functionality. If you wanted to use Go, then a custom provider+data source (as you mentioned previously) becomes the least bad. `local-exec` with a `null-resource` would be less of a good fit than either of those. – Matthew Schuchard Jun 02 '22 at 13:19
  • @MattSchuchard if you mean least bad of all solutions discussed, I don't agree: having a hardcoded max nesting level, using pure HCL, is way safer IMO than introducing an external dependency, and if you are on a team, requiring everyone who will run terraform apply to have that external dependency will be a pain. The second best to me would be a custom provider, but that is significantly more work than my solution. – Oliver Jun 02 '22 at 13:54
  • Oh I assumed Terraform was executing from within a pipeline, because that is true for a majority of organizations, and therefore the environment is standardized and managed within an agent. – Matthew Schuchard Jun 02 '22 at 14:30
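
For reference, here is a minimal sketch of the external data source approach discussed in the comments above, assuming jq is installed on the machine running Terraform; the query keys and the jq filter are illustrative, not anything the hashicorp/external provider prescribes beyond its contract (JSON object in on stdin, JSON object with string values out on stdout):

data "external" "tree_lookup" {
  # jq receives the query object as JSON on stdin and must print a JSON
  # object whose values are all strings, per the external data source contract.
  program = [
    "jq",
    ". as $q | { value: ($q.tree | fromjson | getpath($q.path / \",\") | tostring) }"
  ]

  query = {
    tree = file("${path.module}/file.json")
    path = join(",", local.path) # assumes the keys themselves contain no commas
  }
}

# data.external.tree_lookup.result.value would then be "value1"

As the comments note, the tradeoff is that everyone running terraform apply now needs jq (or whatever tool program invokes) installed locally.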

1 Answer


The Terraform language has no built-in functionality for this sort of arbitrary dynamic traversal.

As you noted in your question, it is possible in principle for a provider to offer this functionality. It wasn't clear to me whether you didn't want to use a provider at all or just didn't want to be the one to write it, so in case it's the latter I can at least offer a provider I already wrote and published which can potentially address this need: apparentlymart/javascript. It exposes a JavaScript interpreter into the Terraform language which you can use for arbitrarily complex data manipulation:

terraform {
  required_providers {
    javascript = {
      source  = "apparentlymart/javascript"
      version = "0.0.1"
    }
  }
}

variable "traversal_path" {
  type = list(string)
}

data "javascript" "example" {
  source = <<-EOT
    for (var i = 0; i < path.length; i++) {
      data = data[path[i]]
    }
    data
  EOT

  vars = {
    data = jsondecode(file("${path.module}/file.json"))
    path = var.traversal_path
  }
}

output "result" {
  value = data.javascript.example.result
}

I can run this with different values of var.traversal_path to select different parts of the data structure in the JSON file:

$ terraform apply -var='traversal_path=["some1", "path1", "key1"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s

Changes to Outputs:
  + result = "value1"

You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

result = "value1"

$ terraform apply -var='traversal_path=["some1", "path1", "key2"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s

Changes to Outputs:
  ~ result = "value1" -> "value2"

You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

result = "value2"

$ terraform apply -var='traversal_path=["some1", "path1", "key3"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s

Changes to Outputs:
  - result = "value2" -> null

You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

I included the final example above to be explicit that escaping into JavaScript for this problem means adopting some of JavaScript's behaviors rather than Terraform's. JavaScript handles looking up a non-existent object property by returning undefined rather than returning an error as Terraform would, and the javascript data source translates that undefined into a Terraform null. If you want to treat a missing property as an error, as Terraform would, you'd need to write some logic into the loop to test whether data is defined after each step; you can use the JavaScript throw statement to raise an error from inside the given script.
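
For example, the loop could be adapted along these lines (a sketch; the error message wording is just illustrative):

  source = <<-EOT
    for (var i = 0; i < path.length; i++) {
      data = data[path[i]]
      // Mimic Terraform's behavior: fail loudly instead of returning null.
      if (data === undefined) {
        throw "no value at path " + path.slice(0, i + 1).join(".")
      }
    }
    data
  EOT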

Of course it's not ideal to embed one language inside another like this, but since the Terraform language is intended for relatively straightforward declarations rather than general computation, I think it's reasonable to use an escape hatch like this when the overall problem fits within the Terraform language but one small part of it would benefit from the generality of a general-purpose language.


Bonus chatter: if you prefer a more functional style to the for loop I used above then you can alternatively make use of the copy of Underscore.js that's embedded inside the provider, using _.propertyOf to handle the traversal in a single statement:

  source = <<-EOT
    _.propertyOf(data)(path)
  EOT
Martin Atkins
  • Thanks @martin for that solution. Using a full-fledged JS interpreter is a little heavy-handed for the problem as-is, but it's great to know that it's there if I need it. I wonder if you could extend your answer so that I can accept it: how does using that (specific) provider differ from just calling a JS interpreter in a local exec; are there pros and cons? E.g. I think the JS engine is embedded in your provider, so no platform dependency? But it also means the user has to trust your JS and whether you keep it updated with security patches? – Oliver Jun 04 '22 at 13:12
  • When you say "in a local exec", are you talking about `provisioner "local-exec"`? Provisioners are not intended for this sort of data manipulation but I guess a similar comparison would be to the `external` data source from the `hashicorp/external` provider, which allows running an arbitrary external program as long as it produces a JSON object with all strings as values. – Martin Atkins Jun 06 '22 at 18:54
  • The JavaScript runtime in `apparentlymart/javascript` is, aside from the embedded Underscore.js bundle, just a vanilla JavaScript runtime without any browser-like or NodeJS-like supporting libraries for things like network requests, so I would not expect it to regularly need security updates but indeed if it does then you'd need to either rely on me to update it or fork the provider and update it yourself. – Martin Atkins Jun 06 '22 at 18:56
  • I meant any mechanism based on executing a local program, i.e. the `local-exec` provisioner, the `external` provider, etc., to run bash or nodejs or jq, etc. – Oliver Jun 07 '22 at 02:01