
I have a ~400 MB JSON file that I'm trying to convert into a dataset of rows and columns. I'm using the code below to open the file in a Jupyter Notebook, but I receive a MemoryError:

import json
import pandas as pd

with open(r'file_path', encoding="utf8") as f:
    data = json.load(f)
df = pd.io.json.json_normalize(data['rows'])

Error:

MemoryError                               Traceback (most recent call last)
<ipython-input-2-79552ba3688b> in <module>()
      1 with open(r'file_path', encoding="utf8") as f:
----> 2     data = json.load(f)
      3 df = pd.io.json.json_normalize(data['rows'])

C:\Users\...\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    294 
    295     """
--> 296     return loads(fp.read(),
    297         cls=cls, object_hook=object_hook,
    298         parse_float=parse_float, parse_int=parse_int,

C:\Users\...\lib\codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

MemoryError:

This code works perfectly with a 300 KB file.

I tried both 32-bit and 64-bit Python, and my Windows computer has 8 GB of RAM.

Any ideas on how to load the file into a dataset?
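In case it helps, here is a minimal sketch of one way to keep the peak DataFrame memory down: parse the file once, then normalize `data['rows']` in slices instead of all at once. The `json_rows_to_df` helper and the `chunk_size` value are hypothetical names for illustration, and it assumes a reasonably recent pandas where `pd.json_normalize` (the replacement for the deprecated `pd.io.json.json_normalize`) is available:

```python
import json
import pandas as pd

def json_rows_to_df(path, chunk_size=50_000):
    # Parse the file once, then normalize the rows slice by slice so
    # only one chunk-sized intermediate DataFrame exists at a time.
    with open(path, encoding="utf8") as f:
        rows = json.load(f)["rows"]
    frames = (
        pd.json_normalize(rows[i:i + chunk_size])
        for i in range(0, len(rows), chunk_size)
    )
    return pd.concat(frames, ignore_index=True)
```

Note that `json.load` still holds the entire parsed object in memory, so this only reduces the normalization overhead; to avoid parsing the whole file at once you would need a streaming JSON parser (the third-party `ijson` package is one option).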

Thanks, Naz

    First step: get 64-bit Python, so you aren't limited by your address space. – user2357112 Apr 06 '17 at 18:04
  • 32-bit can only address 4GB (at most). – cdarke Apr 06 '17 at 18:07
  • Downloaded the 64-bit version, still same issue. – n4zy Apr 06 '17 at 18:37
  • Possible duplicate of [Memory errors and list limits?](http://stackoverflow.com/questions/5537618/memory-errors-and-list-limits) – ivan_pozdeev Apr 06 '17 at 19:00
  • The json file has ~200k rows when converting to a table, shouldn't a win64 with 8GB ram be sufficient to load the 400MB json file? I'm wondering if I should parse the file into chunks. – n4zy Apr 06 '17 at 19:24
  • nvm the 64-bit python worked!! I had to restart my jupyter-notebook from the Anaconda 64-bit download. Thanks all! – n4zy Apr 06 '17 at 19:47
