I have a ~400 MB JSON file that I'm trying to convert into a dataset of columns and rows. I'm using the code below to open the file in a Jupyter Notebook, but I'm getting a MemoryError:
import json
import pandas as pd

with open(r'file_path', encoding="utf8") as f:
    data = json.load(f)
df = pd.io.json.json_normalize(data['rows'])
Error:
MemoryError                               Traceback (most recent call last)
<ipython-input-2-79552ba3688b> in <module>()
      1 with open(r'file_path', encoding="utf8") as f:
----> 2     data = json.load(f)
      3 df = pd.io.json.json_normalize(data['rows'])

C:\Users\...\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    294
    295     """
--> 296     return loads(fp.read(),
    297         cls=cls, object_hook=object_hook,
    298         parse_float=parse_float, parse_int=parse_int,

C:\Users\...\lib\codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

MemoryError:
With a 300 KB file this code works perfectly.
I tried both 32-bit and 64-bit Python, and my Windows computer has 8 GB of RAM.
Any ideas on how to load the file into a dataset?
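I was also wondering whether a streaming parser such as ijson would help, since it wouldn't read the whole file into memory at once. Here is a rough sketch of what I had in mind (just an idea, not tested on my file; the 'rows.item' path and the 100000 batch size are guesses based on my file's structure):

import ijson          # streaming JSON parser (pip install ijson)
import pandas as pd

rows = []
chunks = []
with open(r'file_path', 'rb') as f:          # ijson reads from a binary file object
    # 'rows.item' yields the elements of the top-level "rows" array one at a time
    for record in ijson.items(f, 'rows.item'):
        rows.append(record)
        # flatten every 100000 records so only a small batch is held in memory
        if len(rows) >= 100000:
            chunks.append(pd.io.json.json_normalize(rows))
            rows = []
if rows:                                     # flatten whatever is left over
    chunks.append(pd.io.json.json_normalize(rows))
df = pd.concat(chunks, ignore_index=True)

Would something like this be the right direction, or is there a better way?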
Thanks, Naz