I have a very large JSON file. A sample of the data looks like this:

{"userActivities":{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"},"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-20"}}}

I need to filter on the "localDate" field using jq stream so that the output looks like the following JSON:

{"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"}}

Any help/guidance is greatly appreciated!

Sains

2 Answers

1

The simplest way to retain the key-value pair while making a selection based on the values is to use with_entries:

jq '.userActivities
| with_entries(select(.value.localDate=="2018-01-20"))' input.json

Output

{
  "L3ATSFGrpAYRkIIKqrh": {
    "deviceId": "60ee32c2fae8dcf0",
    "dow": "Friday",
    "localDate": "2018-01-20"
  }
}
peak
  • I am working with a very large JSON file of around 20 GB, so I am looking for a solution using jq stream – Sains Sep 14 '18 at 04:14
  • Question clearly states the answer needs to use `--stream`. – tilde Jun 08 '19 at 02:05
  • @tilde - The original question did not specify `--stream` but did mention streaming. In the context of jq, there is a distinction between the so-called "streaming parser" (which is enabled by the --stream command-line option), and a "stream" of JSON texts. The original question mentioned a large file, without specifying whether it was a single JSON text or a stream of JSON texts. If the latter, then using the streaming parser would most probably be too slow. – peak Jun 08 '19 at 03:01
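
To illustrate the distinction drawn in the comment above, here is a small sketch. The tiny inputs are made up for illustration and are not from the question. The first command shows the events that the streaming parser (`--stream`) emits for a single JSON text; the second shows plain jq handling a stream of separate JSON texts one at a time.

```shell
# Streaming parser: one JSON text is broken into [path, leaf] events,
# followed by closing events that carry only a path.
events=$(printf '%s\n' '{"a":{"b":1}}' | jq -c --stream '.')
printf '%s\n' "$events"
# [["a","b"],1]
# [["a","b"]]
# [["a"]]

# Stream of JSON texts: plain jq already reads them one at a time,
# so --stream is not needed here.
texts=$(printf '%s\n' '{"x":1}' '{"x":2}' | jq -c 'select(.x == 2)')
printf '%s\n' "$texts"
# {"x":2}
```
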
1
jq -cn --stream 'fromstream(1|truncate_stream(inputs | select(.[0][0] == "userActivities"))) | with_entries(select(.value.localDate=="2018-01-19"))' input.json

output:

{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"}}
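
Note that `fromstream(1|truncate_stream(...))` reassembles the whole `userActivities` object before `with_entries` runs, so memory use can still grow with the size of the file. The following is one possible sketch of a per-entry variant. It is not part of the answer above, and it assumes that every activity is a non-empty object directly under `userActivities`. The closing event of each activity is rewritten into a top-level closing event, so `fromstream` emits one single-key object at a time. The temporary file name is only for the demonstration.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{"userActivities":{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"},"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-20"}}}
EOF

result=$(jq -cn --stream '
  fromstream(
    inputs
    # keep only events located inside an individual activity
    | select(.[0][0] == "userActivities" and (.[0] | length) >= 3)
    # closing event of one activity: turn it into a top-level close [[key]]
    | if length == 1 and (.[0] | length) == 3
      then [.[0][1:2]]
      # every other event: drop the leading "userActivities" path element
      else .[0] |= .[1:]
      end
  )
  # each emitted object has a single key; test the date of its value
  | select(.[].localDate == "2018-01-19")
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With the sample data this prints the same single-entry object as the command above, but each activity is filtered as soon as it has been read.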
gregory
  • Thanks a ton, it works. Is there a way to filter on multiple values of the same field? For example: localDate in ("2018-01-19", "2018-01-20")? – Sains Sep 14 '18 at 06:44
  • Yes. But the syntax is like this: `jq -cn --stream 'fromstream(1|truncate_stream(inputs | select(.[0][0] == "userActivities"))) | with_entries(select(.value.localDate == "2018-01-19" or .value.localDate == "2018-01-20"))' input.json` – gregory Sep 14 '18 at 17:15
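
For longer lists of values, chaining `or` gets unwieldy. A minimal sketch of an alternative, assuming jq 1.6 or later (where the `IN` builtin is available), is shown below on the sample data from the question. The second date, "2018-01-21", is hypothetical and does not occur in the sample; it is there only to show that several candidate values can be listed.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
{"userActivities":{"L3ATRosRdbDgSmX75Z":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-19"},"L3ATSFGrpAYRkIIKqrh":{"deviceId":"60ee32c2fae8dcf0","dow":"Friday","localDate":"2018-01-20"}}}
EOF

# IN(...) is true when its input equals any of the listed values.
result=$(jq -c '
  .userActivities
  | with_entries(select(.value.localDate | IN("2018-01-20", "2018-01-21")))
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The same `select(.value.localDate | IN(...))` test should also be usable in place of the `or` chain in the `--stream` command.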