I'm working with a JSON web response that looks like this (simplified; I can't change the format):
[
{ "type": "0","key1": 3, "key2": 5},
{ "type": "1","key3": "a", "key4": "b"},
{ "type": "2", "data": [<very big array here>] }
]
I want to do the following:
- Inspect the first two objects without reading everything into memory. I can do this with ijson:
parsed = ijson.items(res.raw, 'item')
next(parsed) # first item
next(parsed) # second item
- Inspect the third object without loading all of it into memory. If I call
next(parsed)
again, the entire third object, including the whole "data" array, is read into memory and turned into a dict, which I want to avoid.
- Inspect the data array without loading it all into memory. If I didn't care about the other keys, I could do this:
parsed = ijson.items(res.raw, 'item.data.item') # iterator over data's items
The problem is, I need to do all of these on the same stream.
Ideally, I would receive the third object as a file-like object that I could pass to ijson again, but that seems out of scope for its API.
I'm also fine with replacing ijson with a library that can do this better.