93

I would like to filter JSON files using jq:

jq . some.json

Given JSON containing an array of objects:

{
  "theList": [
    {
      "id": 1,
      "name": "Horst"
    },
    {
      "id": 2,
      "name": "Fritz"
    },
    {
      "id": 3,
      "name": "Walter"
    },
    {
      "id": 4,
      "name": "Gerhart"
    },
    {
      "id": 5,
      "name": "Harmut"
    }
  ]
}

I want to filter that list to show only the elements whose id is 2 or 4, so the expected output is:

{
  "id": 2,
  "name": "Fritz"
},
{
  "id": 4,
  "name": "Gerhart"
}

How do I filter the JSON using jq? I have played around with select and map, but didn't get either of them to work, e.g.:

$ jq '.theList[] | select(.id == 2) or select(.id == 4)' array.json
true
k0pernikus

4 Answers

137

From the docs:

jq '.[] | select(.id == "second")' 

Input [{"id": "first", "val": 1}, {"id": "second", "val": 2}]

Output {"id": "second", "val": 2}

I think you can do something like this:

jq '.theList[] | select(.id == 2 or .id == 4)' array.json
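
As a quick check, the filter can be run against a trimmed copy of the sample from the question. This is only a sketch: it assumes jq is installed, the input is piped in from a shell variable (`json`, a name made up here) instead of being read from array.json, and `-c` is there only to print one object per line.

```shell
# Trimmed sample input; the variable name is illustrative only.
json='{"theList":[{"id":1,"name":"Horst"},{"id":2,"name":"Fritz"},{"id":4,"name":"Gerhart"}]}'

# Stream the elements and keep those whose id is 2 or 4.
result=$(printf '%s' "$json" | jq -c '.theList[] | select(.id == 2 or .id == 4)')
printf '%s\n' "$result"
# prints:
# {"id":2,"name":"Fritz"}
# {"id":4,"name":"Gerhart"}
```

Note that the result is a stream of separate objects, not an array. Wrapping the whole filter in [ ... ] collects them into a single array.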
André Senra
22

You could use select within map.

.theList | map(select(.id == (2, 4)))

Or more compact:

[ .theList[] | select(.id == (2, 4)) ]

Written that way, though, it is a little inefficient, since the comparison is repeated for every value being compared, even after a match. It is more efficient, and possibly more readable, written this way:

[ .theList[] | select(.id as $id | any(2, 4; . == $id)) ]
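
The duplication mentioned above comes from how jq treats the comma: (2, 4) is a generator, so a comparison against it yields one boolean per value, and select runs once for each. A small sketch with `jq -n` (no input file needed) makes this visible. It assumes jq is installed; the literal 2 on the left stands in for an .id, and the shell variable names are made up for the example.

```shell
# One comparison result per value on the right-hand side.
flags=$(jq -n -c '[2 == (2, 4)]')
echo "$flags"    # [true,false]

# Consequence: if a value is listed twice, a matching element is emitted twice.
dupes=$(jq -n -c '[{"id": 2} | select(.id == (2, 2))]')
echo "$dupes"    # [{"id":2},{"id":2}]
```

With distinct values such as 2 and 4 an element can match at most once, so the output for the question's data is still correct; the cost is only the extra comparisons.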
Jeff Mercado
  • Clever use of [if behavior when condition returns multiple results](https://github.com/stedolan/jq/blob/master/docs/content/3.manual/manual.yml#L2033-L2035)! – jq170727 Aug 28 '17 at 00:30
8

Using select(.id == (2, 4)) here is generally inefficient (see below).

If your jq has IN/1, then it can be used to achieve a more efficient solution:

.theList[] | select( .id | IN(2,4))

If your jq does not have IN/1, then you can define it as follows:

def IN(s): any(s == .; .);
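
If the definition is needed, it goes in front of the main filter in the same program string. The sketch below assumes jq 1.5 or later (for any/2) and uses an any/2-based definition, which as far as I know matches the builtin shipped with jq 1.6; treat that as an assumption. The input is a trimmed copy of the sample, and the shell variable names are made up for the example.

```shell
json='{"theList":[{"id":2,"name":"Fritz"},{"id":3,"name":"Walter"},{"id":4,"name":"Gerhart"}]}'

# The def comes first, then the main filter. A def given in the program
# takes precedence over a builtin of the same name.
names=$(printf '%s' "$json" | jq -c '
  def IN(s): any(s == .; .);
  [ .theList[] | select(.id | IN(2, 4)) | .name ]')
echo "$names"    # ["Fritz","Gerhart"]
```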

Efficiency

One way to see the inefficiency is to use debug. The following expression, for example, results in 10 calls to debug, whereas only 9 checks for equality are actually needed:

.theList[] | select( (.id == (2,3)) | debug )

["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",true]
{
  "id": 2,
  "name": "Fritz"
}
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",true]
{
  "id": 3,
  "name": "Walter"
}
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",false]

index/1

In principle, using index/1 should be efficient, but as of this writing (October 2017), its implementation, though fast (it is written in C), is inefficient.

peak
0

Here is a solution using indices:

.theList | [ .[map(.id)|indices(2,4)[]] ]
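
To see how this works, it may help to look at the intermediate result first. The sketch below assumes jq 1.5 or later and uses a cut-down input that keeps only the ids; the shell variable names are made up for the example.

```shell
json='{"theList":[{"id":1},{"id":2},{"id":3},{"id":4},{"id":5}]}'

# Step 1: the positions at which each wanted id occurs in the list of ids.
positions=$(printf '%s' "$json" | jq -c '.theList | map(.id) | indices(2, 4)')
echo "$positions"
# prints:
# [1]
# [3]

# Step 2: the full expression uses those positions to index back into .theList.
picked=$(printf '%s' "$json" | jq -c '.theList | [ .[map(.id) | indices(2, 4)[]] ]')
echo "$picked"    # [{"id":2},{"id":4}]
```

One thing to be aware of: the results come out in the order of the values passed to indices, not in list order, and an id that does not occur simply contributes nothing.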
jq170727