I have an Ubuntu laptop with 8 GB of RAM and a 2 GB CSV file, but when I use pandas' read_csv to load the data, the RAM fills up completely even though about 7 GB were free. How does a 2 GB file fill 7 GB of RAM?
- Can you paste code to accompany your question? – Nov 09 '16 at 20:16
- These SO threads may be helpful: http://stackoverflow.com/questions/19590966/memory-error-with-large-data-sets-for-pandas-concat-and-numpy-append and http://stackoverflow.com/questions/17557074/memory-error-when-using-pandas-read-csv – Bharath Nov 09 '16 at 20:28
2 Answers
The reason for the high memory usage (and for the low_memory warning you may see) might be that guessing dtypes for each column is very memory-demanding: pandas analyses the data in every column to decide which dtype to set.
If you are on a 32-bit system: memory errors happen a lot with Python when using the 32-bit build on Windows, because a 32-bit process only gets 2 GB of address space to play with by default.
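If you are not sure which build you are running, one quick way to check (a small sketch, works on CPython) is:

import struct

# Prints 32 on a 32-bit interpreter, 64 on a 64-bit one.
print(struct.calcsize('P') * 8)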
Try this:

import pandas as pd

# Read the file in 1,000-row chunks, then stitch the pieces back together once.
tp = pd.read_csv('file_name.csv', header=None, chunksize=1000)
df = pd.concat(tp, ignore_index=True)
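Specifying dtypes up front also helps, since pandas then doesn't have to guess (this is what the question's author ended up doing, per the comment below). A minimal sketch, with hypothetical column names and types:

import pandas as pd

# Hypothetical column names and dtypes -- adjust to match your CSV
# (assumes the file has a header row with these names).
dtypes = {'user_id': 'int32', 'price': 'float32', 'category': 'category'}
df = pd.read_csv('file_name.csv', dtype=dtypes, usecols=list(dtypes))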

harshil9968
- Yes, it was because of dtypes; I converted some columns' dtypes as I was loading. Thanks. – Shoobi Nov 11 '16 at 05:20
- I have tried to upvote, but it is not displayed publicly because I have less than 15 reputation ;) – Shoobi Nov 16 '16 at 07:14
Try making use of the chunksize parameter:

import pandas as pd

# Read the file in 10,000-row chunks and concatenate them into a single DataFrame.
df = pd.concat((chunk for chunk in pd.read_csv('/path/to/file.csv', chunksize=10**4)),
               ignore_index=True)
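If you never actually need the whole table in memory at once, you can also process each chunk as it is read and skip the final concat; a rough sketch, where the column name and the aggregation are only illustrative:

import pandas as pd

total = 0
for chunk in pd.read_csv('/path/to/file.csv', chunksize=10**4):
    # Replace this with whatever per-chunk work you actually need.
    total += chunk['some_numeric_column'].sum()

print(total)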

MaxU - stand with Ukraine
- Your first version is horribly inefficient; add a note: http://pandas.pydata.org/pandas-docs/stable/merging.html – Jeff Nov 09 '16 at 20:23
- Every loop iteration you were making a copy of a bigger and bigger frame; instead, append to a list and call concat once (as the current example does). – Jeff Nov 09 '16 at 20:29
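A sketch of the pattern Jeff describes, with the file path and chunk size only illustrative:

import pandas as pd

# Anti-pattern: growing a DataFrame inside the loop copies it on every iteration.
# df = pd.DataFrame()
# for chunk in pd.read_csv('file.csv', chunksize=10**4):
#     df = pd.concat([df, chunk])

# Better: collect the chunks in a list and concatenate once at the end.
chunks = []
for chunk in pd.read_csv('file.csv', chunksize=10**4):
    chunks.append(chunk)
df = pd.concat(chunks, ignore_index=True)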