I would like to slice a huge JSON file (~20 GB) into smaller chunks of data based on array size (10000, 50000, etc.).
Input:
{"recDt":"2021-01-05",
"country":"US",
"name":"ABC",
"number":"9828",
"add": [
{"evnCd":"O","rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},
{"evnCd":"O","rngNum":"2","state":"TX","city":"ANDERSON","postal":"77832"},
{"evnCd":"O","rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},
{"evnCd":"O","rngNum":"4","state":"TX","city":"ANDERSON","postal":"77834"}
]
}
I am currently running a loop that increments the x/y values to get the desired output, but performance is very slow: each iteration takes 8-20 seconds, depending on the size of the file. I am using jq 1.6. Is there an alternative way to get the result below?
Expected output, for a slice of 2 objects per array:
{"recDt":"2021-01-05","country":"US","name":"ABC","number":"9828","add":[{"rngNum":"1","state":"TX","city":"ANDERSON","postal":"77830"},{"rngNum":"2","state":"TX","city":"ANDERSON","postal":"77832"}]}
{"recDt":"2021-01-05","country":"US","name":"ABC","number":"9828","add":[{"rngNum":"3","state":"TX","city":"ANDERSON","postal":"77831"},{"rngNum":"4","state":"TX","city":"ANDERSON","postal":"77834"}]}
I tried the following:
cat $inFile | jq -cn --stream 'fromstream(1|truncate_stream(inputs))' | jq --arg x $x --arg y $y -c '{recDt: .recDt, country: .country, name: .name, number: .number, add: .add[$x|tonumber:$y|tonumber]}' >> $outFile
cat $inFile | jq --arg x $x --arg y $y -c '{recDt: .recDt, country: .country, name: .name, number: .number, add: .add[$x|tonumber:$y|tonumber]}' >> $outFile
Please share any alternatives that are available.
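One direction that might avoid re-reading the file for every x/y pair is to let a single jq invocation emit all slices in one pass. The following is only a minimal sketch under some assumptions: `split_add` is a hypothetical helper name, the chunk size is passed as a number with `--argjson` (so no `tonumber` is needed), and the whole document still has to fit in memory, as in the second command above. It keeps every field of each `add` element, including `evnCd`.

```shell
#!/bin/sh
# split_add is a hypothetical helper, not part of jq.
# Usage: split_add CHUNK_SIZE INPUT_FILE
# Emits one compact JSON object per slice of the "add" array,
# repeating the other top-level fields unchanged in each object.
split_add() {
  jq -c --argjson n "$1" '
    # $i takes the start offset of every slice: 0, n, 2n, ...
    range(0; (.add | length); $n) as $i
    # replace "add" with the slice [i, i+n); the last slice may be shorter
    | .add = .add[$i:$i + $n]
  ' "$2"
}
```

With the variables from the commands above, this could be called as `split_add 10000 "$inFile" > "$outFile"`. An empty `add` array produces no output lines. For input that does not fit in memory, a `--stream` based variant would be needed instead; that is not shown here.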