I am trying to get a 4 GB CSV file into a DataFrame, but I get a memory error.
I tried the chunksize solution but I still have the problem.
I have also heard it could be related to the Python version, but I am running Anaconda Python 3.6 64-bit (at least that is what it is supposed to be).
Here is my function:
import os
import pandas as pd

def dataFrameCreationCsv(path):
    print('Loading Csv file...')
    if os.path.isfile(path):
        # dataFrameCsv = pd.read_csv(path, encoding='latin1', delimiter='|')  # memory issues on a large dataset
        tp = pd.read_csv(path, encoding='latin1', delimiter='|', iterator=True, chunksize=1000)  # gives a TextFileReader, iterable in chunks of 1000 rows
        dfCsv = pd.concat(tp, ignore_index=True)
        return dfCsv
    else:
        print('wrong path used')
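
In case it is relevant, here is the per-chunk variant I could fall back to, where each chunk is processed separately so the whole file is never concatenated in memory. It is only a sketch: the aggregation and the column name 'amount' are placeholders, not the real columns of my file.

import pandas as pd

def processCsvInChunks(path):
    # Keep only one 1000-row chunk in memory at a time instead of
    # building the full DataFrame with pd.concat.
    total = 0
    reader = pd.read_csv(path, encoding='latin1', delimiter='|', chunksize=1000)
    for chunk in reader:
        # 'amount' is a hypothetical column name used for illustration.
        total += chunk['amount'].sum()
    return total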