
I'm trying to use pandas_profiling to profile a table. It has around 20 columns, most of them floats, and almost 3 million records.

I got the following error:

Traceback (most recent call last):
  File "V:\Python\prof.py", line 53, in <module>
    if __name__ == "__main__": main()
  File "V:\Python\prof.py", line 21, in main
    df = pd.read_sql(query, sql_conn)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 380, in read_sql
    chunksize=chunksize)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 1477, in read_query
    data = self._fetchall_as_list(cursor)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 1486, in _fetchall_as_list
    result = cur.fetchall()
MemoryError

I tried with fewer records and it worked.

Is there a way to work around this error? It looks like a memory limitation. Can this be done another way, or is it impossible with Python?

Thanks for your help

Simon
Linus

1 Answer


If you can provide enough information for us to reproduce the error, we can look into resolving it. I recommend opening an issue on the package's GitHub page.
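
In the meantime, note that the traceback ends inside pd.read_sql (at cur.fetchall()), before any profiling starts, so it may be worth trying to load the table in chunks rather than in one call. The following is only a minimal sketch under assumptions: the helper name read_sql_in_chunks, the table name measurements, the chunk size, and the in-memory SQLite connection are all placeholders standing in for your real query and sql_conn. The float32 downcast is optional and trades precision for memory.

```python
import sqlite3

import numpy as np
import pandas as pd


def read_sql_in_chunks(query, conn, chunksize=100_000):
    """Read a query result piece by piece instead of one big fetchall().

    Hypothetical helper for illustration; not part of pandas or pandas_profiling.
    """
    pieces = []
    # With chunksize set, pd.read_sql returns an iterator of DataFrames.
    for chunk in pd.read_sql(query, conn, chunksize=chunksize):
        # Optionally downcast float64 columns to float32 to roughly halve
        # their memory use (at the cost of precision).
        float_cols = chunk.select_dtypes(include="float64").columns
        chunk = chunk.astype({col: np.float32 for col in float_cols})
        pieces.append(chunk)
    return pd.concat(pieces, ignore_index=True)


# Stand-in for the real database: a small in-memory SQLite table.
conn = sqlite3.connect(":memory:")
demo = pd.DataFrame({"a": np.arange(1000, dtype="float64"), "b": np.ones(1000)})
demo.to_sql("measurements", conn, index=False)

df = read_sql_in_chunks("SELECT * FROM measurements", conn, chunksize=250)
print(df.shape)
print(df.dtypes)
```

The combined DataFrame still has to fit in memory, so if this is not enough, profiling a random sample of the rows may be the more practical route.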

Disclosure: I am a co-author of this package.

Simon