Using jq, how can I convert an array into an object indexed by filename, or read multiple files into one object indexed by their filenames?

e.g.

jq -s 'map(select(.roles[]? | contains ("mysql")))' -C dir/file1.json dir/file2.json

This gives me the data I want, but I need to know which file they came from.

So instead of

[
    { "roles": ["mysql"] },
    { "roles": ["mysql", "php"] }
]

for output, I want:

{
    "file1": { "roles": ["mysql"] },
    "file2": { "roles": ["mysql", "php"] }
}

If possible, I also want the ".json" file extension stripped and only the basename kept (directory excluded).


Example

file1.json
{ "roles": ["mysql"] }
file2.json
{ "roles": ["mysql", "php"] }
file3.json
{ }

My real files obviously have other stuff in them too, but that should be enough for this example. file3 is simply to demonstrate "roles" is sometimes missing.

In other words: I'm trying to find files that contain "mysql" in their list of "roles". I need the filename and contents combined into one JSON object.


To simplify the problem further:

jq 'input_filename' f1 f2

Gives me all the filenames like I want, but I don't know how to combine them into one object or array.

Whereas,

jq -s 'map(input_filename)' f1 f2

Gives me the same filename repeated once for each file, e.g. [ "f1", "f1" ] instead of [ "f1", "f2" ].

peak
mpen

2 Answers

If your jq has inputs (as does jq 1.5) then the task can be accomplished with just one invocation of jq. Also, it might be more efficient to use any than iterating over all the elements of .roles.

The trick is to invoke jq with the -n option, so that jq does not consume the first file on its own and inputs sees every file, e.g.

jq -n '
  [inputs
   | select(.roles and any(.roles[]; contains("mysql")))
   | {(input_filename | gsub(".*/|\\.json$";"")): .}]
  | add' file*.json
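
A minimal sketch for trying this out, assuming a jq with inputs (1.5 or later) and a POSIX shell with mktemp. The scratch directory and the `tmp` and `result` variable names are only for illustration; the three files mirror the samples from the question, including the one without "roles".

```shell
# Recreate the question's three sample files in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
printf '%s\n' '{ "roles": ["mysql"] }'        > "$tmp/dir/file1.json"
printf '%s\n' '{ "roles": ["mysql", "php"] }' > "$tmp/dir/file2.json"
printf '%s\n' '{ }'                           > "$tmp/dir/file3.json"

# -n keeps jq from reading file1.json by itself, so `inputs` yields all three.
# -c prints the merged object on a single line.
result=$(jq -n -c '
  [inputs
   | select(.roles and any(.roles[]; contains("mysql")))
   | {(input_filename | gsub(".*/|\\.json$";"")): .}]
  | add' "$tmp"/dir/file*.json)

echo "$result"
rm -rf "$tmp"
```

file3 is dropped by the select, and the two remaining objects are merged by add under their stripped filenames.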
peak
  • If I understand this right, `any(.roles[]; contains("mysql"))` actually means "if any role contains *the substring* 'mysql'" no? For exact equivalence isn't `any(.roles[]; . == "mysql")` more appropriate? – mpen Jul 09 '18 at 16:22
  • I only used "contains" because that is what was in the original question, and it gives equivalence thereto. However, if you require string equality, then of course you would use `==`. – peak Jul 09 '18 at 16:54
  • Ah...fair enough. My mistake then :-) I thought it was like an "array contains" when I first found it :D – mpen Jul 09 '18 at 21:19
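
To make the distinction in these comments concrete, here is a small sketch comparing the two tests on a role that only includes "mysql" as a substring. The "mysql-client" role and the variable names are made up for illustration.

```shell
# A document whose only role merely contains the substring "mysql".
doc='{ "roles": ["mysql-client"] }'

# On strings, contains() is a substring test.
substring=$(printf '%s' "$doc" | jq 'any(.roles[]; contains("mysql"))')

# == requires the whole string to match.
exact=$(printf '%s' "$doc" | jq 'any(.roles[]; . == "mysql")')

echo "substring match: $substring"   # true
echo "exact match:     $exact"       # false
```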

jq approach:

jq 'if any(.roles[]?; contains("mysql")) then {(input_filename | gsub(".*/|\\.json$";"")): .}
    else empty end' ./file1.json ./file2.json | jq -s 'add'

The expected output:

{
  "file1": {
    "roles": [
      "mysql"
    ]
  },
  "file2": {
    "roles": [
      "mysql",
      "php"
    ]
  }
}
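
The key expression can also be checked on its own. This sketch feeds a few made-up paths (they do not need to exist as files) through the same gsub call to show that both the directory part and the ".json" suffix are removed; the `names` variable is just for illustration.

```shell
# Strip any leading directory and a trailing ".json" from sample path strings.
# -n: no input files are read; -r: print raw strings without quotes.
names=$(jq -n -r '
  "dir/file1.json", "./file2.json", "file3.json"
  | gsub(".*/|\\.json$";"")')

echo "$names"
```

Each path is reduced to its bare name (file1, file2, file3), one per line.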
RomanPerekhrest
  • Beautiful! Didn't know about `if`. Can also be done with `select`: `jq 'select (.roles[]? | contains("mysql")) | {(input_filename | gsub(".*/|\\.json$$";"")): .}' $^ | jq -s 'add'` (formatted for Make) – mpen Jul 06 '18 at 20:53