I'm trying to parse and filter a very large JSON file (about 9 GB) containing tweet metadata. I'm using ijson because it was the library the community recommended most for files of this size. I'm still pretty new to it, but I put together some code that should store values in a list based on certain conditions. While looping through the tweet objects, I get the following error:
parse error: unallowed token at this point in JSON text
sitive": false, "lang": "en"}, {"created_at": "Thu Mar 19 1
(right here) ------^
I'm not sure what I need to change for this to work. I got this file by hydrating tweets with the Twarc library. My sample code is below. Has anybody encountered this before?
Sample Code:
import ijson

# march_20_tweets_path holds the path to the hydrated tweets file
with open(march_20_tweets_path, 'rb') as input_file:
    jsonobj = ijson.items(input_file, 'item', multiple_values=True)
    jsons = (o for o in jsonobj if o['place'] is not None)  # error shows here
    for tweet in jsons:
        # extracting and storing values
        pass
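
In case it helps narrow things down, here is a minimal sketch of a check I could run to see how the file is laid out at the top level. My understanding is that the 'item' prefix selects the elements of a top-level array, so the layout seems relevant. The helper name describe_layout, its max_line_bytes parameter, and the labels it returns are my own invention for illustration; it uses only the standard library and reads at most the first non-blank line.

```python
import json


def describe_layout(path, max_line_bytes=1_000_000):
    """Guess the top-level layout of a JSON file (hypothetical helper)."""
    with open(path, 'rb') as f:
        stripped = b''
        # Skip blank lines; cap each read so a file that is one huge
        # line is never loaded into memory in full.
        while not stripped:
            line = f.readline(max_line_bytes)
            if not line:
                return 'empty'
            stripped = line.strip()

    if stripped[:1] == b'[':
        # The file starts with an array.
        return 'array'
    if stripped[:1] == b'{':
        try:
            # If the first line parses on its own, the file is likely
            # one JSON object per line.
            json.loads(stripped)
            return 'object per line'
        except ValueError:
            # For example comma-separated objects on one line, or an
            # object spanning several lines.
            return 'object, other layout'
    return 'other'
```

I would call it as describe_layout(march_20_tweets_path) and compare the result with what the 'item' prefix expects.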