I've been trying to wrap my head around this issue for a while. I have a JSON input that contains an array, something like this:

{
    "array" : [
        {"foo": "bar"},
        {"foo": "buzz"},
        {"misbehaving": "object"}
    ]
}

My goal is to verify that all of the objects in the array have a field named foo (the actual use case is making sure that all resources in a cloud deployment have tags). My issue is that standard Rego expressions are evaluated as "at least one" rather than "for all", which means that expressions like:

all_have_foo_field {
    input.array[_].foo
}

always return true, even though some objects do not satisfy the condition. I've looked at this, but evaluating a regex returns true or false, while my policy checks whether a field exists, meaning that if it does not, I get a 'var_is_unsafe' error.

Any ideas?

SQB
FitzChivalry
    I've found a rather odd way, which is to create a temporary array containing all objects that have the field, and then compare its length to the length of the original array. This seems rather unintuitive... is there a nicer way? – FitzChivalry May 19 '20 at 15:26

1 Answer

There are two ways to say "all fields of elements in X must match these conditions" (FOR ALL).

TLDR:

all_have_foo_field {
  # use negation and a helper rule
  not any_missing_foo_field
}

any_missing_foo_field {
  some i
  input.array[i]
  not input.array[i].foo
}

OR

all_have_foo_field {
  # use a comprehension
  having_foo := {i | input.array[i].foo}
  count(having_foo) == count(input.array)
}

The approach depends on the use case. If you want to know which elements do not satisfy the conditions, the comprehension is nice because you can use set arithmetic, e.g., {i | input.array[i]} - {i | input.array[i].foo} produces the set of array indices that do not have the field "foo". You probably want to assign these expressions to local variables for readability. See this section in the docs for more detail: https://www.openpolicyagent.org/docs/latest/policy-language/#universal-quantification-for-all.
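
As a rough sketch of the set-arithmetic idea (the package name `example` and the rule names `all_indices`, `with_foo`, and `missing_foo` are made up for illustration, not taken from the question), the offending indices could be collected like this:

```rego
package example

# Hypothetical rule names, chosen only for this sketch.

# Indices of every element in the array.
all_indices := {i | input.array[i]}

# Indices of the elements whose "foo" field is defined (and not false).
with_foo := {i | input.array[i].foo}

# Set difference: indices of the elements that lack "foo".
missing_foo := all_indices - with_foo
```

With the sample input from the question, `missing_foo` should contain only index 2, the `{"misbehaving": "object"}` element. Naming the intermediate sets keeps the difference readable and lets a deny message report exactly which elements failed.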

In this case (as opposed to the answer you linked to) we don't have to use regex or anything like that, since references to missing/undefined fields result in undefined, and undefined propagates outward to the expression, query, rule, etc. This is covered to some extent in the Introduction.

All we have to do then is refer to the field in question. Note that, technically, not input.array[i].foo would be TRUE if the "foo" field value is false; however, in many cases undefined and false can be treated as interchangeable (they're not quite the same: false is a valid JSON value, whereas undefined represents the lack of a value). If you need to match only undefined, then you have to assign the result of the reference to a local variable. In the comprehension case we can write:

# the set will contain all values i where field "foo" exists, regardless of its value
{i | _ = input.array[i].foo}

In the negation case we need an additional helper rule since not _ = input.array[i].foo would be "unsafe". We can write:

exists(value, key) { value[key] = _ }

And now not exists(input.array[i], "foo") is only TRUE when the field "foo" is missing.
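
Putting the helper together with the negation approach, a minimal sketch might look like the following. It assumes the same rule-body syntax used in the snippets above (newer OPA releases expect the `if` keyword before rule bodies), and the package name `example` is made up for illustration:

```rego
package example

# Helper: defined (true) when `key` is present in `value`,
# whatever the field's value is, including false.
exists(value, key) { value[key] = _ }

# True when at least one array element has no "foo" field at all.
any_missing_foo_field {
  some i
  input.array[i]
  not exists(input.array[i], "foo")
}

# "For all" expressed as "there is no counterexample".
all_have_foo_field {
  not any_missing_foo_field
}
```

Here `i` is bound by the `input.array[i]` expression before the negated call, so the `not exists(...)` line does not hit the "unsafe" variable error. With the sample input from the question, `all_have_foo_field` should be undefined, because the third element has no "foo" field.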

Note, differentiating between undefined and false is often not worth it--I recommend only doing so when necessary.

tsandall