I'm trying to use pandas_profiling to profile a table. It has around 20 columns, most of them float, and almost 3 million records.
I got the following error:
Traceback (most recent call last):
  File "V:\Python\prof.py", line 53, in <module>
    if __name__ == "__main__": main()
  File "V:\Python\prof.py", line 21, in main
    df = pd.read_sql(query, sql_conn)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 380, in read_sql
    chunksize=chunksize)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 1477, in read_query
    data = self._fetchall_as_list(cursor)
  File "C:\Users\linus\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\sql.py", line 1486, in _fetchall_as_list
    result = cur.fetchall()
MemoryError
I tried with fewer records and it worked.
Is there a way to get around this error? It looks like a memory limitation. Can this be done another way, or is it impossible with Python?
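One direction I am wondering about: the traceback fails at cur.fetchall(), which pulls every row into memory at once. Would fetching the rows in batches avoid that? Here is a minimal sketch of the idea, not tested on my real table. It uses an in-memory SQLite table called demo as a stand-in for my database, and column_stats and batch_size are names I made up for illustration. It only keeps running count, min, max and mean per column, so it is not a full pandas_profiling report.

```python
import sqlite3


def column_stats(conn, query, batch_size=50_000):
    """Stream rows with fetchmany() and keep running statistics per column.

    Hypothetical helper: only one batch of rows is held in memory at a time.
    """
    cur = conn.cursor()
    cur.execute(query)
    names = [d[0] for d in cur.description]
    stats = {n: {"count": 0, "sum": 0.0, "min": None, "max": None} for n in names}

    while True:
        rows = cur.fetchmany(batch_size)  # at most batch_size rows per call
        if not rows:
            break
        for row in rows:
            for name, value in zip(names, row):
                if value is None:  # skip NULLs
                    continue
                s = stats[name]
                s["count"] += 1
                s["sum"] += value
                s["min"] = value if s["min"] is None else min(s["min"], value)
                s["max"] = value if s["max"] is None else max(s["max"], value)

    for s in stats.values():
        s["mean"] = s["sum"] / s["count"] if s["count"] else None
    return stats


# Stand-in data: 1000 rows in an in-memory SQLite table (assumption, not my real table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [(float(i), i * 0.5) for i in range(1000)],
)
result = column_stats(conn, "SELECT a, b FROM demo", batch_size=100)
print(result["a"]["mean"], result["b"]["max"])
```

pd.read_sql also accepts a chunksize argument (it shows up in the traceback), which returns an iterator of DataFrames instead of one big DataFrame, so the same batching idea might apply there as well.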
Thanks for your help.