I need to process a very large JSON file (86 GB). I have tried a few different ways of parsing it, but each one either ran out of memory or crashed my computer, and none of them appeared to produce the output I need anyway.
The input I have is a list of product keys, and the output I need is only the records in the JSON file that pertain to those product keys. Is it possible to read this JSON file and filter it for only the relevant records?
Here is the schema for the file:
{
"groups":
[
{
"groupID",
"model",
"groupname",
"productCodes",
"descriptors",
"externalIDs"
},
{groupID,...},
...
]
}
"productCodes" is an array that contains multiple product keys, like so:
"productCodes": [{
"type": "productkey",
"value": "DEBL6"
}, {
"type": "productkey",
"value": "GBAY4"
}, {
"type": "productkey",
"value": "GBAYE"
}, {
"type": "productkey",
"value": "GBQRF"
}, {
"type": "productkey",
"value": "GBZTD"
}, {
"type": "productkey",
"value": "ZA42A"
}
],
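To make the goal concrete, here is a minimal sketch of the kind of streaming filter I have in mind, using only the Python standard library. It is a sketch under assumptions, not a tested solution for the real file: the names `iter_groups`, `filter_groups`, and `chunk_size` are hypothetical, it assumes the first occurrence of `"groups"` in the file is the top-level key shown in the schema, and it assumes any single group object fits comfortably in memory. It reads the file in chunks and decodes one group at a time with `json.JSONDecoder.raw_decode`, so the whole 86 GB is never loaded at once.

```python
import json


def iter_groups(fp, chunk_size=1 << 20):
    """Yield each object of the top-level "groups" array, one at a time.

    Assumes (hypothetically) that the first '"groups"' in the stream is the
    top-level key and that its value is an array of objects.
    """
    decoder = json.JSONDecoder()
    buf = ""

    # Read until the opening bracket of the "groups" array is in the buffer.
    while True:
        key = buf.find('"groups"')
        start = buf.find("[", key) if key != -1 else -1
        if start != -1:
            buf = buf[start + 1:]
            break
        chunk = fp.read(chunk_size)
        if not chunk:
            return  # no "groups" array found
        buf += chunk

    pos = 0
    while True:
        # Skip whitespace and the commas that separate array elements.
        while pos < len(buf) and buf[pos] in " \t\r\n,":
            pos += 1
        if pos < len(buf) and buf[pos] == "]":
            return  # end of the "groups" array
        try:
            obj, pos = decoder.raw_decode(buf, pos)
        except json.JSONDecodeError:
            # The next object is incomplete: drop consumed text, read more.
            chunk = fp.read(chunk_size)
            if not chunk:
                raise ValueError("input ended before the groups array closed")
            buf = buf[pos:] + chunk
            pos = 0
            continue
        yield obj


def filter_groups(fp, wanted_keys, chunk_size=1 << 20):
    """Yield only the groups whose productCodes contain a wanted key."""
    wanted = set(wanted_keys)  # set lookup keeps each check O(1)
    for group in iter_groups(fp, chunk_size):
        codes = group.get("productCodes", [])
        if any(code.get("value") in wanted for code in codes):
            yield group
```

With something like this, the matching records could be written out one per line as they are found, for example by opening the input with `open(path, encoding="utf-8")` and calling `json.dump` on each yielded group. Malformed input would make the buffer grow until the end of the file, so this only suits a file that is known to be valid JSON. A third-party streaming parser such as ijson may be a more robust option.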