
I have a JSON file that is 2.37 GB with about 2.1 million records. I want to use jq to go through the file and create a new file for every 100,000 records.

i.e.

part1.json, part2.json, part3.json, part4.json, part5.json, etc.

Has anyone done this with jq?

oharr
  • Is there a problem with using the file as it is? Won't the files have to be all put back together to make a coherent object/array to be used eventually? A piece of valid JSON is always one of those two things... if you split it up every x records then you'd presumably have to split it into multiple arrays... would it then still have the same meaning afterwards? Maybe it doesn't matter for your purpose, but it's just something to consider. – ADyson Aug 21 '18 at 21:16
  • @oharr - I believe you've already asked this question at https://github.com/stedolan/jq/issues/1712; an answer was given there, with the suggestion that if you have further questions, you should provide further details. Please also see [mcve] – peak Aug 21 '18 at 21:20
  • See https://stackoverflow.com/questions/49808581/using-jq-how-can-i-split-a-very-large-json-file-into-multiple-files-each-a-spec – peak Aug 21 '18 at 21:21
  • Does this answer your question? [Using jq how can I split a very large JSON file into multiple files, each a specific quantity of objects?](https://stackoverflow.com/questions/49808581/using-jq-how-can-i-split-a-very-large-json-file-into-multiple-files-each-a-spec) – tripleee May 08 '23 at 06:28

1 Answer


Well, you could use jq in conjunction with `split` to write those files: have jq stream the top-level array and emit each record as one compact line, then let `split` start a new file every 100,000 lines.

$ jq -nc --stream 'fromstream(1|truncate_stream(inputs))' large_file.json |
    split -dl 100000 --additional-suffix=.json - part
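
Each resulting partNN.json is newline-delimited JSON (one record per line), not a single JSON array. If, as the comments on the question point out, each chunk needs to be a valid array on its own, one option is to slurp every part back through jq. A minimal sketch, assuming the part*.json names produced by the split command above (the array_ output names are only illustrative):

$ for f in part*.json; do
      jq -s '.' "$f" > "array_${f}"    # -s (slurp) wraps all lines of the part into one array
  done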
Jeff Mercado
  • Does the job, with a minor correction: you have to pass `--additional-suffix`, not `-additional-suffix`, i.e. with two `-` characters. – Chris Mar 27 '23 at 08:17