I have the same scenario as jbyler - I want to parse N lines of a JSON log, where each line is a single object:
host ~ # head /var/log/nginx/access.log.json | cut -c 1-80
{"remote_addr":"127.0.0.1","remote_port":"47700","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"35576","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"47708","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"52974","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"52976","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"51414","time_iso8601":"2022-08-31T00:0
{"remote_addr":"127.0.0.1","remote_port":"41942","time_iso8601":"2022-08-31T00:1
{"remote_addr":"127.0.0.1","remote_port":"41946","time_iso8601":"2022-08-31T00:1
{"remote_addr":"127.0.0.1","remote_port":"37982","time_iso8601":"2022-08-31T00:1
{"remote_addr":"127.0.0.1","remote_port":"56602","time_iso8601":"2022-08-31T00:1
I do not like the --slurp solution because this will cause jq to read the entire file, when I might only be interested in the first N lines.
The solution for this seems to be the input function combined with jq -n: input reads a single input object, and -n stops jq from consuming the first object itself before the filter runs:
host ~ # cat /var/log/nginx/access.log.json | jq -r -n 'input | [.remote_addr, .remote_port, .time_iso8601] | @tsv'
127.0.0.1 47700 2022-08-31T00:02:53+02:00
Combined with range, I can use it to read up to N input objects:
host ~ # cat /var/log/nginx/access.log.json | jq -r -n 'range(10) as $i | input | [.remote_addr, .remote_port, .time_iso8601] | @tsv'
127.0.0.1 47700 2022-08-31T00:02:53+02:00
127.0.0.1 35576 2022-08-31T00:02:53+02:00
127.0.0.1 47708 2022-08-31T00:02:53+02:00
127.0.0.1 52974 2022-08-31T00:07:53+02:00
127.0.0.1 52976 2022-08-31T00:07:53+02:00
127.0.0.1 51414 2022-08-31T00:07:58+02:00
127.0.0.1 41942 2022-08-31T00:12:53+02:00
127.0.0.1 41946 2022-08-31T00:12:53+02:00
127.0.0.1 37982 2022-08-31T00:13:03+02:00
127.0.0.1 56602 2022-08-31T00:17:53+02:00
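One caveat with the range approach: as far as I can tell, input raises an error once the inputs run out, so a log with fewer than N lines ends in an error. A variant that should stop cleanly is jq's limit builtin applied to inputs. This is a swapped-in technique, not the command used above, and the sample file and its port values below are made up for illustration:

```shell
# Hypothetical three-line sample log; the field names mirror the log above.
log=$(mktemp)
printf '%s\n' \
  '{"remote_addr":"127.0.0.1","remote_port":"1001"}' \
  '{"remote_addr":"127.0.0.1","remote_port":"1002"}' \
  '{"remote_addr":"127.0.0.1","remote_port":"1003"}' > "$log"

# Ask for up to 10 objects. Only 3 exist, and limit simply stops
# when inputs is exhausted instead of failing.
first_n=$(jq -r -n 'limit(10; inputs) | [.remote_addr, .remote_port] | @tsv' "$log")
echo "$first_n"

rm -f "$log"
```

With a longer file, limit stops emitting after the 10th object, so the filter itself is the same shape as the range version.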
Be careful when wrapping this in an array, though: doing it wrong will cause jq to reevaluate the [range(10) as $i | input] expression once per index, reading 10 more inputs each time, so here I'm not getting elements [0, 1, 2], but [0, 11, 22]:
# WRONG, DON'T DO THIS
host ~ # cat /var/log/nginx/access.log.json | jq -r -n '[range(10) as $i | input][0,1,2] | [.remote_addr, .remote_port, .time_iso8601] | @tsv'
127.0.0.1 47700 2022-08-31T00:02:53+02:00
127.0.0.1 59784 2022-08-31T00:18:22+02:00
127.0.0.1 34316 2022-08-31T00:37:53+02:00
To safely access the same array multiple times, you need to have a pipe in between:
host ~ # cat /var/log/nginx/access.log.json | jq -r -n '[range(10) as $i | input] | .[0,1,2] | [.remote_addr, .remote_port, .time_iso8601] | @tsv'
127.0.0.1 47700 2022-08-31T00:02:53+02:00
127.0.0.1 35576 2022-08-31T00:02:53+02:00
127.0.0.1 47708 2022-08-31T00:02:53+02:00
This way, you can take up to N objects from the input file, and then extract arbitrary ones from this selection.
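The difference between the two forms is easy to reproduce on synthetic input. A minimal sketch, assuming seq is available and using the plain numbers 0..29 as stand-ins for log lines (each one is a valid JSON value):

```shell
# Without a pipe, the array expression is re-evaluated for every index,
# and each evaluation consumes the next 10 inputs.
wrong=$(seq 0 29 | jq -c -n '[[range(10) as $i | input][0,1,2]]')

# With a pipe, the array is built once and then indexed three times.
right=$(seq 0 29 | jq -c -n '[range(10) as $i | input] | [.[0,1,2]]')

echo "re-evaluated: $wrong"   # [0,11,22]
echo "piped:        $right"   # [0,1,2]
```

The outer brackets are only there to collect the three results into one line of output; they do not change which inputs get read.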