
I have the following JSON:

[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  },
  {
    "id": "10",
    "foo": "bar-c",
    "hello": "world-c"
  },
  {
    "id": "42",
    "foo": "bar-d",
    "hello": "world-d"
  }
]

And I have the following array stored in a variable: ["1", "2", "56", "1337"] (note that the IDs are strings and may contain any regular character).

Thanks to another SO post, I found a way to filter my original data. jq '[.[] | select(.id == ("1", "2", "56", "1337"))]' ./data.json (note that the values are wrapped in parentheses, not brackets) produces:

[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  }
]

But I would also like to do the opposite (basically, excluding IDs instead of selecting them). Using select(.id != ("1", "2", "56", "1337")) doesn't work, and using jq '[. - [.[] | select(.id == ("1", "2", "56", "1337"))]]' ./data.json seems very ugly and doesn't work with my actual data (the output of aws ec2 describe-instances).

Do you have any idea how to do that? Thank you!

2 Answers


To include them, you need to verify that the id is any of the values in the keep set.

$ jq --argjson include '["1", "2", "56", "1337"]' 'map(select(.id == $include[]))' ...

To exclude them, you need to verify that the id matches none of the values in your excluded set. But it might just be easier to take the original set and remove the items that are in the excluded set.

$ jq --argjson exclude '["1", "2", "56", "1337"]' '. - map(select(.id == $exclude[]))' ...
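
As a hedged sketch of the direct "matches none of them" test: select(.id != $exclude[]) does not behave as an exclusion because the comparison runs once per listed value, so each object is emitted once for every value it differs from. One way to express the check is all/2, which should be available in jq 1.5 and later. The shell variable names data and result are only for this illustration; the array is the sample from the question.

```shell
# Sample array from the question (ids are strings).
data='[
  {"id": "1",  "foo": "bar-a", "hello": "world-a"},
  {"id": "2",  "foo": "bar-b", "hello": "world-b"},
  {"id": "10", "foo": "bar-c", "hello": "world-c"},
  {"id": "42", "foo": "bar-d", "hello": "world-d"}
]'

# Keep an object only if its id differs from every excluded value.
# all(generator; condition) is true when the condition holds for each value.
result=$(printf '%s' "$data" | jq -c --argjson exclude '["1", "2", "56", "1337"]' \
  'map(select(.id as $id | all($exclude[]; . != $id)))')

echo "$result"
```

On the sample data this should leave only the objects with ids "10" and "42", the same result as the subtraction above, without building the matching list first.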
Jeff Mercado
  • Hi, thank you for the answer. For the first one, naming the variable "$include" raises an error, but naming it something else made it work (it took me 5 minutes to figure that out). Still, it's better than the one mentioned in the linked SO. The exclude solution is still a bit ugly, I think, and more importantly, it doesn't work. As I said, my actual JSON data is returned by the AWS CLI (so the objects are pretty heavy). – Florentin Le Moal Jul 21 '16 at 22:04
  • @Florentin: you'll need to provide an example more representative of your situation, then. The provided answer gets the result you desire for the given example. There are other approaches, but they are a little more complicated to express, which is why I took this one. If you want a solution tailored to your needs, you'll need to provide more info. – Jeff Mercado Jul 24 '16 at 03:31
  • @JeffMercado Yeah, I'll try to produce an anonymized version of my data so you can test with it. For now, since I'm iterating over all my objects, I just check at each iteration whether the ID of the object is in my "only filter" or in my "ignore filter". Thank you nonetheless :) – Florentin Le Moal Jul 24 '16 at 11:23

Here is a solution that uses inside. Assuming you run jq as

jq -M --argjson IDS '["1","2","56","1337"]' -f filter.jq data.json

This filter.jq

map( select([.id] | inside($IDS)) )

produces the objects from data.json whose id is in the $IDS array:

[
  {
    "id": "1",
    "foo": "bar-a",
    "hello": "world-a"
  },
  {
    "id": "2",
    "foo": "bar-b",
    "hello": "world-b"
  }
]

and this filter.jq

map( select([.id] | inside($IDS) | not) )

produces the objects from data.json whose id is not in the $IDS array:

[
  {
    "id": "10",
    "foo": "bar-c",
    "hello": "world-c"
  },
  {
    "id": "42",
    "foo": "bar-d",
    "hello": "world-d"
  }
]
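
One caveat, as far as I understand the contains/inside semantics: strings nested in arrays are compared by substring, so an id such as "13" would count as being inside ["1337"]. The sample data does not trigger this, but IDs that "may contain any regular character" could. Below is a sketch comparing the inside-based filter with an exact-match variant built on any/2; the sample input with id "13" is hypothetical and exists only to show the difference.

```shell
IDS='["1", "2", "56", "1337"]'

# Hypothetical input: "13" is not in IDS, but it is a substring of "1337".
sample='[{"id": "13"}, {"id": "2"}, {"id": "10"}]'

# inside-based filter: substring matching lets "13" through.
loose=$(printf '%s' "$sample" | jq -c --argjson IDS "$IDS" \
  'map(select([.id] | inside($IDS)) | .id)')

# Exact comparison of the id against each listed value.
exact=$(printf '%s' "$sample" | jq -c --argjson IDS "$IDS" \
  'map(select(.id as $id | any($IDS[]; . == $id)) | .id)')

echo "inside: $loose"
echo "exact:  $exact"
```

Appending | not inside the select gives the exclusion form of either filter.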
jq170727