I am trying to parse a huge JSON file (around 20 GB). I read it one line at a time (each line is a JSON object) and extract the required details.
Example:
The JSON file data looks like this:
{
{a: [], b: [], c: [], d: [],e: []},
{a: [], b: [], c: [], d: [],e: []},
.....,
{a: [], b: [], c: [], d: [],e: []},
}
Snippet to parse:
import json

count = 0
with open(fileName) as fp:
    try:
        for line in fp:
            data_local = json.loads(line)
            count = count + 1
            # access data_local["a"]
    except:
        print "Error found", count, len(data_local["a"])
Error message (when the "except" block is not used):
Traceback (most recent call last):
  File "./xyzFile", line 606, in <module>
    for line in fp:
SystemError: Negative size passed to PyString_FromStringAndSize
Output (when the "except" block is used):
Error found 65 5392287
I found something similar on Stack Overflow, but it didn't help. I tried to debug by catching the exception: the error is thrown after 65 JSON objects (lines) have been read successfully, so it happens while reading the next line. Each JSON object is huge, both in size and in number of values.
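To narrow this down further, here is a minimal diagnostic sketch that I have not yet run against the real file. It measures the byte length of every line by reading fixed-size binary chunks, so no single huge line ever has to be built as one string. The function name line_lengths and the 1 MiB default chunk size are my own choices for illustration, not part of the original script.

```python
import io


def line_lengths(path, chunk_size=1 << 20):
    """Yield (line_number, byte_length) per line, newline excluded.

    Hypothetical helper: reads the file in binary chunks instead of
    iterating over lines, so memory use stays bounded by chunk_size.
    """
    line_no = 1
    length = 0  # bytes seen so far in the current line
    with io.open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                pos = chunk.find(b"\n", start)
                if pos == -1:
                    # No more newlines in this chunk: the line continues.
                    length += len(chunk) - start
                    break
                length += pos - start
                yield line_no, length
                line_no += 1
                length = 0
                start = pos + 1
    if length:
        # Final line without a trailing newline.
        yield line_no, length
```

Looping over line_lengths(fileName) and printing any entry whose length exceeds 2**31 - 1 would show whether the line after the 65th is too large for a signed 32-bit size. That would be consistent with the "Negative size" in the error, but it is only a guess on my part.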
Any lead on this would be appreciated.
Thanks